LEAFY COTYLEDON1 GENES AND THEIR USES

FIELD OF THE INVENTION
The present invention is directed to plant genetic engineering. In particular,
it relates to new embryo-specific genes useful in improving agronomically important
plants.

BACKGROUND OF THE INVENTION
Embryogenesis in higher plants is a critical stage of the plant life cycle in which
the primary organs are established. Embryo development can be separated into two main
phases: the early phase, in which the primary body organization of the embryo is laid down,
and the late phase, which involves maturation, desiccation and dormancy. In the early phase,
the symmetry of the embryo changes from radial to bilateral, giving rise to a hypocotyl with a
shoot meristem surrounded by the two cotyledonary primordia at the apical pole and a root
meristem at the basal pole. In the late phase, during maturation, the embryo achieves its
maximum size and the seed accumulates storage proteins and lipids. Maturation is ended by
the desiccation stage, in which the seed water content decreases rapidly and the embryo passes
into a metabolically quiescent state. Dormancy ends with seed germination, and development
continues from the shoot and the root meristem regions.

The precise regulatory mechanisms which control cell and organ differentiation
during the initial phase of embryogenesis are largely unknown. The plant hormone abscisic
acid (ABA) is thought to play a role during late embryogenesis, mainly in the maturation
stage, by inhibiting germination during embryogenesis (Black, M. (1991). In Abscisic Acid:
Physiology and Biochemistry, W. J. Davies and H. G. Jones, eds. (Oxford: Bios Scientific
Publishers Ltd.), pp. 99-124; Koornneef, M., and Karssen, C. M. (1994). In Arabidopsis, E.
M. Meyerowitz and C. R. Somerville, eds. (Cold Spring Harbor: Cold Spring Harbor
Laboratory Press), pp. 313-334). Mutations which affect seed development and are ABA
insensitive have been identified in Arabidopsis and maize. The ABA insensitive (abi3) mutant
of Arabidopsis and the viviparous1 (vp1) mutant of maize are detected mainly during late
embryogenesis (McCarty et al. (1989) Plant Cell 1:523-532 and Parcy et al. (1994) Plant
Cell 6:1567-1582). Both the VP1 gene and the ABI3 gene have been isolated and were
found to share conserved regions (Giraudat, J. (1995) Current Opinion in Cell Biology
7:232-238 and McCarty, D. R. (1995) Annu. Rev. Plant Physiol. Plant Mol. Biol.
46:71-93). The VP1 gene has been shown to function as a transcription activator
(McCarty et al. (1991) Cell 66:895-906). It has been suggested that ABI3 has a similar
function.

Another class of embryo defective mutants involves three genes: LEAFY
COTYLEDON1 and 2 (LEC1, LEC2) and FUSCA3 (FUS3). These genes are thought to
play a central role in late embryogenesis (Baumlein et al. (1994) Plant J. 6:379-387;
Meinke, D. W. (1992) Science 258:1647-1650; Meinke et al., Plant Cell 6:1049-1064;
West et al. (1994) Plant Cell 6:1731-1745). Like the abi3 mutant, leafy cotyledon-type
mutants are defective in late embryogenesis. In these mutants, seed morphology is altered,
the shoot meristem is activated early, storage proteins are lacking and developing
cotyledons accumulate anthocyanin. As with abi3 mutants, they are desiccation intolerant
and therefore die during late embryogenesis. Nevertheless, the immature mutant embryos
can be rescued to give rise to mature and fertile plants. However, unlike abi3, when the
immature mutants germinate they exhibit trichomes on the adaxial surface of the
cotyledon. Trichomes are normally present only on leaves, stems and sepals, not
cotyledons. Therefore, it is thought that the leafy cotyledon type genes have a role in
specifying cotyledon identity during embryo development.

Among the above mutants, the lec1 mutant exhibits the most extreme
phenotype during embryogenesis. For example, the maturation and postgermination
programs are active simultaneously in the lec1 mutant (West et al., 1994), suggesting a
critical role for LEC1 in gene regulation during late embryogenesis.

In spite of the recent progress in defining the genetic control of embryo
development, further progress is required in the identification and analysis of genes
expressed specifically in the embryo and seed. Characterization of such genes would
allow for the genetic engineering of plants with a variety of desirable traits. For instance,
modulation of the expression of genes which control embryo development may be used to
alter traits such as accumulation of storage proteins in leaves and cotyledons.
Alternatively, promoters from embryo or seed-specific genes can be used to direct
expression of desirable heterologous genes to the embryo or seed. The present invention
addresses these and other needs.
SUMMARY OF THE INVENTION
The present invention is based, in part, on the isolation and characterization of
LEC1 genes. The invention provides isolated nucleic acid molecules comprising a LEC1
polynucleotide sequence, typically about 630 nucleotides in length, which specifically
hybridizes to SEQ. ID. No. 1 under stringent conditions. The LEC1 polynucleotides of the
invention can encode a LEC1 polypeptide of about 210 amino acids, typically as shown in
SEQ. ID. No. 2.

The nucleic acids of the invention may also comprise expression cassettes
containing a plant promoter operably linked to the LEC1 polynucleotide. In some
embodiments, the promoter is from a LEC1 gene, for instance, as shown in SEQ. ID. No. 3.
The LEC1 polynucleotide may be linked to the promoter in a sense or antisense orientation.

The invention also provides transgenic plants comprising an expression
cassette containing a plant promoter operably linked to a heterologous LEC1
polynucleotide. The LEC1 polynucleotide may encode a LEC1 polypeptide or may be linked to the
promoter in an antisense orientation. The plant promoter may be from any number of
sources, including a LEC1 gene, such as that shown in SEQ. ID. No. 3 or SEQ. ID. No. 4.
The transgenic plant can be any desired plant but is often a member of the genus Brassica.

Methods of modulating seed development in plants are also provided. The
methods comprise introducing into a plant an expression cassette containing a plant promoter
operably linked to a heterologous LEC1 polynucleotide. The LEC1 polynucleotide may encode a LEC1
polypeptide or may be linked to the promoter in an antisense orientation. The expression
cassette can be introduced into the plant by any number of means known in the art, including
through a sexual cross.

The invention further provides expression cassettes containing promoter
sequences from LEC1 genes. The promoters of the invention can be characterized by their
ability to specifically hybridize to a polynucleotide sequence consisting of nucleotides 1 to
1998 of SEQ. ID. No. 3. The promoters of the invention can be operably linked to a variety
of nucleic acids whose expression is to be targeted to embryos or seeds. Transgenic plants
comprising the expression cassettes are also provided.

The promoters of the invention can be used in methods of targeting expression
of a desired polynucleotide to seeds. The methods comprise introducing into a plant an
expression cassette containing a LEC1 promoter operably linked to a heterologous
polynucleotide sequence.

Definitions 

The phrase "nucleic acid" refers to a single or double-stranded polymer of 
deoxyribonucleotide or ribonucleotide bases read from the 5' to the 3' end. Nucleic acids 
may also include modified nucleotides that permit correct read through by a polymerase 
and do not alter expression of a polypeptide encoded by that nucleic acid. 

The phrase "polynucleotide sequence" or "nucleic acid sequence" includes 
both the sense and antisense strands as either individual single strands or in the duplex. It 
includes, but is not limited to, self-replicating plasmids, chromosomal sequences, and 
infectious polymers of DNA or RNA. 

The phrase "nucleic acid sequence encoding" refers to a nucleic acid which 
directs the expression of a specific protein or peptide. The nucleic acid sequences include 
both the DNA strand sequence that is transcribed into RNA and the RNA sequence that is 
translated into protein. The nucleic acid sequences include both the full length nucleic 
acid sequences as well as non-full length sequences derived from the full length sequences. 
It should be further understood that the sequence includes the degenerate codons of the 
native sequence or sequences which may be introduced to provide codon preference in a 
specific host cell. 

The term "promoter" refers to a region or set of sequence determinants located
upstream or downstream from the start of transcription which is involved in
recognition and binding of RNA polymerase and other proteins to initiate transcription. A
"plant promoter" is a promoter capable of initiating transcription in plant cells. Such
promoters need not be of plant origin; for example, promoters derived from plant viruses,
such as the CaMV35S promoter, can be used in the present invention.

The term "plant" includes whole plants, plant organs (e.g., leaves, stems,
flowers, roots, etc.), seeds and plant cells and progeny of same. The class of plants which 
can be used in the method of the invention is generally as broad as the class of higher 
plants amenable to transformation techniques, including both monocotyledonous and 
dicotyledonous plants, as well as certain lower plants such as algae. It includes plants of a 
variety of ploidy levels, including polyploid, diploid and haploid. 
A polynucleotide sequence is "heterologous to" an organism or a second
polynucleotide sequence if it originates from a foreign species, or, if from the same
species, is modified from its original form. For example, a promoter operably linked to a
heterologous coding sequence refers to a coding sequence from a species different from
that from which the promoter was derived, or, if from the same species, a coding sequence
which is different from any naturally occurring allelic variants. As defined here, a
modified LEC1 coding sequence which is heterologous to an operably linked LEC1 promoter
does not include the T-DNA insertional mutants as described in West et al., The Plant Cell
6:1731-1745 (1994).

A polynucleotide "exogenous to" an individual plant is a polynucleotide which
is introduced into the plant by any means other than by a sexual cross. Examples of means by
which this can be accomplished are described below, and include Agrobacterium-mediated
transformation, biolistic methods, electroporation, in planta techniques, and the like. Such a
plant containing the exogenous nucleic acid is referred to here as an R1 generation transgenic
plant. Transgenic plants which arise from a sexual cross or by selfing are descendants of
such a plant.

As used herein an "embryo-specific gene" or "seed-specific gene" is a gene
that is preferentially expressed during embryo development in a plant. For purposes of
this disclosure, embryo development begins with the first cell divisions in the zygote,
continues through the late phase of embryo development (characterized by maturation,
desiccation and dormancy), and ends with the production of a mature and desiccated seed.
Embryo-specific genes can be further classified as "early phase-specific" and
"late phase-specific". Early phase-specific genes are those expressed in embryos up to the end of embryo
morphogenesis. Late phase-specific genes are those expressed from maturation through to
production of a mature and desiccated seed.

A "LEC1 polynucleotide" is a nucleic acid sequence comprising (or consisting
of) a coding region of about 100 to about 900 nucleotides, sometimes from about 300 to
about 630 nucleotides, which hybridizes to SEQ. ID. No. 1 under stringent conditions (as
defined below), or which encodes a LEC1 polypeptide. LEC1 polynucleotides can also be
identified by their ability to hybridize under low stringency conditions (e.g., Tm -40°C) to
nucleic acid probes having a sequence from position 1 to 81 in SEQ. ID. No. 1 or from
position 355 to 627 in SEQ. ID. No. 1.
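The probe coordinates given in this definition are 1-based and inclusive, which differs from the 0-based, end-exclusive slicing used by most programming languages. The following Python sketch shows one way to extract such probe regions. The function name `subsequence` and the placeholder sequence are illustrative assumptions; the actual SEQ. ID. No. 1 is not reproduced here.

```python
def subsequence(seq: str, start: int, end: int) -> str:
    """Return positions start..end of seq, using 1-based inclusive coordinates."""
    if not (1 <= start <= end <= len(seq)):
        raise ValueError("coordinates fall outside the sequence")
    # Convert to Python's 0-based, end-exclusive slice.
    return seq[start - 1:end]


# Placeholder only: NOT the real SEQ. ID. No. 1, just a string long enough
# to cover position 627.
placeholder_seq = "ACGT" * 158  # 632 nucleotides

probe_5prime = subsequence(placeholder_seq, 1, 81)     # positions 1 to 81
probe_3prime = subsequence(placeholder_seq, 355, 627)  # positions 355 to 627

print(len(probe_5prime), len(probe_3prime))
```

Under these coordinates the two probe regions are 81 and 273 nucleotides long, respectively.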
A "promoter from a LEC1 gene" or "LEC1 promoter" will typically be about
500 to about 2000 nucleotides in length, usually from about 750 to 1500. An exemplary
promoter sequence is shown as nucleotides 1-1998 of SEQ. ID. No. 3. A LEC1 promoter
can also be identified by its ability to direct expression in all, or essentially all, proglobular
embryonic cells, as well as cotyledons and axes of a late embryo.

A "LEC1 polypeptide" is a sequence of about 50 to about 210, sometimes 100
to 150, amino acid residues encoded by a LEC1 polynucleotide. A full length LEC1
polypeptide and fragments containing a CCAAT binding factor (CBF) domain can act as a
subunit of a protein capable of acting as a transcription factor in plant cells. LEC1
polypeptides are often distinguished by the presence of a sequence which is required for
binding the nucleotide sequence CCAAT. In particular, a short region of seven residues
(MPIANVI) at residues 34-40 of SEQ. ID No. 3 shows a high degree of similarity to a region
that has been shown to be required for binding the CCAAT box. Similarly, residues 61-72 of
SEQ. ID No. 3 (IQECVSEYISFV) are nearly identical to a region that contains a subunit
interaction domain (Xing et al. (1993) EMBO J. 12:4647-4655).

As used herein, a homolog of a particular embryo-specific gene (e.g., SEQ. 
ID. No. 1) is a second gene in the same plant type or in a different plant type, which has a 
polynucleotide sequence of at least 50 contiguous nucleotides which are substantially identical 
(determined as described below) to a sequence in the first gene. It is believed that, in general, 
homologs share a common evolutionary past. 

A "polynucleotide sequence from" a particular embryo-specific gene is a
subsequence or full length polynucleotide sequence of an embryo-specific gene which, when
present in a transgenic plant, has the desired effect, for example, inhibiting expression of the
endogenous gene or driving expression of a heterologous polynucleotide. A full length
sequence of a particular gene disclosed here may contain about 95%, usually at least about
98%, of an entire sequence shown in the Sequence Listing, below.

In the case of both expression of transgenes and inhibition of endogenous 
genes (e.g., by antisense, or sense suppression) one of skill will recognize that the inserted 
polynucleotide sequence need not be identical and may be "substantially identical" to a 
sequence of the gene from which it was derived. As explained below, these variants are 
specifically covered by this term. 
In the case where the inserted polynucleotide sequence is transcribed and
translated to produce a functional polypeptide, one of skill will recognize that because of
codon degeneracy a number of polynucleotide sequences will encode the same polypeptide.
These variants are specifically covered by the term "polynucleotide sequence from" a
particular embryo-specific gene, such as LEC1. In addition, the term specifically includes
sequences (e.g., full length sequences) substantially identical (determined as described below)
with a LEC1 gene sequence and that encode proteins that retain the function of a LEC1
polypeptide.
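To give a sense of how many polynucleotide sequences can encode the same polypeptide because of codon degeneracy, the following Python sketch counts the distinct coding sequences for a short peptide under the standard genetic code. The function name `count_encodings` is an illustrative assumption, and the seven-residue motif MPIANVI quoted elsewhere in this specification is used only as a convenient example.

```python
from collections import Counter
from itertools import product
from math import prod

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Number of synonymous codons available for each amino acid.
CODONS_PER_RESIDUE = Counter(CODON_TABLE.values())


def count_encodings(peptide: str) -> int:
    """Count the DNA sequences that encode the peptide (stop codon excluded)."""
    peptide = peptide.upper()
    unknown = set(peptide) - set(CODONS_PER_RESIDUE) | ({"*"} & set(peptide))
    if unknown:
        raise ValueError(f"not a standard amino acid: {sorted(unknown)}")
    return prod(CODONS_PER_RESIDUE[aa] for aa in peptide)


print(count_encodings("MPIANVI"))  # 1152
```

Even this seven-residue stretch can be encoded by more than a thousand different nucleotide sequences, and the number grows multiplicatively with length.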

In the case of polynucleotides used to inhibit expression of an endogenous 
gene, the introduced sequence need not be perfectly identical to a sequence of the target
endogenous gene. The introduced polynucleotide sequence will typically be at least 
substantially identical (as determined below) to the target endogenous sequence. 

Two nucleic acid sequences or polypeptides are said to be "identical" if the 
sequence of nucleotides or amino acid residues, respectively, in the two sequences is the same 
when aligned for maximum correspondence as described below. The term "complementary
to" is used herein to mean that the sequence is complementary to all or a portion of a 
reference polynucleotide sequence. 

Optimal alignment of sequences for comparison may be conducted by the local
homology algorithm of Smith and Waterman, Adv. Appl. Math. 2:482 (1981), by the
homology alignment algorithm of Needleman and Wunsch, J. Mol. Biol. 48:443 (1970), by
the search for similarity method of Pearson and Lipman, Proc. Natl. Acad. Sci. (U.S.A.)
85:2444 (1988), by computerized implementations of these algorithms (GAP, BESTFIT,
BLAST, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics
Computer Group (GCG), 575 Science Dr., Madison, WI), or by inspection.

"Percentage of sequence identity" is determined by comparing two optimally
aligned sequences over a comparison window, wherein the portion of the polynucleotide
sequence in the comparison window may comprise additions or deletions (i.e., gaps) as
compared to the reference sequence (which does not comprise additions or deletions) for
optimal alignment of the two sequences. The percentage is calculated by determining the
number of positions at which the identical nucleic acid base or amino acid residue occurs in
both sequences to yield the number of matched positions, dividing the number of matched
positions by the total number of positions in the window of comparison and multiplying the
result by 100 to yield the percentage of sequence identity.
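The calculation just described can be sketched in a few lines of Python. This is a minimal illustration that assumes the two sequences have already been optimally aligned (for example by one of the programs named above) and that gaps are written as "-". The function name `percent_identity` and the example sequences are illustrative assumptions and are not taken from the Sequence Listing.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of sequence identity over an already-aligned comparison window."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("comparison window is empty")
    # A matched position carries the identical base or residue in both sequences;
    # a gap never counts as a match.
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    # Divide by the total number of positions in the window, then scale to percent.
    return 100.0 * matches / len(aligned_a)


# Ten-position window with one gap and one mismatch: 8 matched positions.
print(percent_identity("ATGGCTAGCA", "ATGGC-AGCT"))  # 80.0
```

Note that gap positions count toward the total number of positions in the window, so introducing gaps lowers the percentage unless they produce additional matches.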

The term "substantial identity" of polynucleotide sequences means that a 
polynucleotide comprises a sequence that has at least 80% sequence identity, preferably at 
least 85%, more preferably at least 90% and most preferably at least 95%, compared to a 
reference sequence using the programs described above (preferably BLAST) using standard 
parameters. One of skill will recognize that these values can be appropriately adjusted to 
determine corresponding identity of proteins encoded by two nucleotide sequences by taking 
into account codon degeneracy, amino acid similarity, reading frame positioning and the like. 
Substantial identity of amino acid sequences for these purposes normally means sequence 
identity of at least 40%, preferably at least 60%, more preferably at least 90%, and most 
preferably at least 95%. Polypeptides which are "substantially similar" share sequences as 
noted above except that residue positions which are not identical may differ by conservative 
amino acid changes. Conservative amino acid substitutions refer to the interchangeability of 
residues having similar side chains. For example, a group of amino acids having aliphatic 
side chains is glycine, alanine, valine, leucine, and isoleucine; a group of amino acids having 
aliphatic-hydroxyl side chains is serine and threonine; a group of amino acids having amide- 
containing side chains is asparagine and glutamine; a group of amino acids having aromatic 
side chains is phenylalanine, tyrosine, and tryptophan; a group of amino acids having basic 
side chains is lysine, arginine, and histidine; and a group of amino acids having sulfur-
containing side chains is cysteine and methionine. Preferred conservative amino acid
substitution groups are: valine-leucine-isoleucine, phenylalanine-tyrosine, lysine-arginine,
alanine-valine, aspartic acid-glutamic acid, and asparagine-glutamine.
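The side-chain groups listed above lend themselves to a simple lookup. The Python sketch below encodes the six side-chain groups (aliphatic, aliphatic-hydroxyl, amide-containing, aromatic, basic, sulfur-containing) with one-letter amino acid codes and tests whether two equal-length polypeptide sequences differ only by conservative changes. The names `CONSERVATIVE_GROUPS`, `is_conservative` and `differs_only_conservatively` are illustrative assumptions, and the sketch deliberately does not model the separate list of preferred substitution groups.

```python
# Side-chain groups from the text, written with one-letter amino acid codes.
CONSERVATIVE_GROUPS = (
    frozenset("GAVLI"),  # aliphatic
    frozenset("ST"),     # aliphatic-hydroxyl
    frozenset("NQ"),     # amide-containing
    frozenset("FYW"),    # aromatic
    frozenset("KRH"),    # basic
    frozenset("CM"),     # sulfur-containing
)


def is_conservative(a: str, b: str) -> bool:
    """True if a and b are different residues drawn from the same group."""
    a, b = a.upper(), b.upper()
    if a == b:
        return False
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


def differs_only_conservatively(seq_a: str, seq_b: str) -> bool:
    """True if every position is identical or a conservative substitution."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return all(
        x.upper() == y.upper() or is_conservative(x, y)
        for x, y in zip(seq_a, seq_b)
    )


print(differs_only_conservatively("MPIANVI", "MPLANVI"))  # True  (I -> L)
print(differs_only_conservatively("MPIANVI", "MPDANVI"))  # False (I -> D)
```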

Another indication that nucleotide sequences are substantially identical is if 
two molecules hybridize to each other, or a third nucleic acid, under stringent conditions. 
Stringent conditions are sequence dependent and will be different in different circumstances. 
Generally, stringent conditions are selected to be about 5°C lower than the thermal melting
point (Tm) for the specific sequence at a defined ionic strength and pH. The Tm is the 
temperature (under defined ionic strength and pH) at which 50% of the target sequence 
hybridizes to a perfectly matched probe. Typically, stringent conditions will be those in which 
the salt concentration is about 0.02 molar at pH 7 and the temperature is at least about 60°C. 
In the present invention, mRNA encoded by embryo-specific genes of the
invention can be identified in Northern blots under stringent conditions using cDNAs of the
invention or fragments of at least about 100 nucleotides. For the purposes of this disclosure,
stringent conditions for such RNA-DNA hybridizations are those which include at least one
wash in 0.2X SSC at 63°C for 20 minutes, or equivalent conditions. Genomic DNA or
cDNA comprising genes of the invention can be identified using the same cDNAs (or
fragments of at least about 100 nucleotides) under stringent conditions, which for purposes of
this disclosure, include at least one wash (usually 2) in 0.2X SSC at a temperature of at least
about 50°C, usually about 55°C, for 20 minutes, or equivalent conditions.


BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 is a restriction map of the 7.4 kb genomic wild-type fragment shown
in SEQ. ID. No. 4.

DESCRIPTION OF THE PREFERRED EMBODIMENT
The present invention provides new embryo-specific genes useful in genetically
engineering plants. Polynucleotide sequences from the genes of the invention can be used, for
instance, to direct expression of desired heterologous genes in embryos (in the case of
promoter sequences) or to modulate development of embryos or other organs (e.g., by
enhancing expression of the gene in a transgenic plant). In particular, the invention provides a
new gene from Arabidopsis referred to here as LEC1. LEC1 encodes polypeptides which are
subunits of a protein which acts as a transcription factor. Thus, modulation of the expression
of this gene can be used to manipulate a number of useful traits, such as increasing or
decreasing storage protein content in cotyledons or leaves.

Generally, the nomenclature and the laboratory procedures in recombinant
DNA technology described below are those well known and commonly employed in the art.
Standard techniques are used for cloning, DNA and RNA isolation, amplification and
purification. Generally, enzymatic reactions involving DNA ligase, DNA polymerase,
restriction endonucleases and the like are performed according to the manufacturer's
specifications. These techniques and various other techniques are generally performed
according to Sambrook et al., Molecular Cloning - A Laboratory Manual, 2nd ed., Cold
Spring Harbor Laboratory, Cold Spring Harbor, New York (1989).
Isolation of nucleic acids of the invention

The isolation of sequences from the genes of the invention may be 
accomplished by a number of techniques. For instance, oligonucleotide probes based on the 
sequences disclosed here can be used to identify the desired gene in a cDNA or genomic 
DNA library from a desired plant species. To construct genomic libraries, large segments of 
genomic DNA are generated by random fragmentation, e.g. using restriction endonucleases, 
and are ligated with vector DNA to form concatemers that can be packaged into the
appropriate vector. To prepare a library of embryo-specific cDNAs, mRNA is isolated from 
embryos and a cDNA library which contains the gene transcripts is prepared from the mRNA. 



The cDNA or genomic library can then be screened using a probe based upon
the sequence of a cloned embryo-specific gene such as the polynucleotides disclosed here.
Probes may be used to hybridize with genomic DNA or cDNA sequences to isolate
homologous genes in the same or different plant species.

Alternatively, the nucleic acids of interest can be amplified from nucleic acid
samples using amplification techniques. For instance, polymerase chain reaction (PCR)
technology can be used to amplify the sequences of the genes directly from mRNA, from cDNA, from
genomic libraries or cDNA libraries. PCR and other in vitro amplification methods may also
be useful, for example, to clone nucleic acid sequences that code for proteins to be expressed,
to make nucleic acids to use as probes for detecting the presence of the desired mRNA in
samples, for nucleic acid sequencing, or for other purposes.

Appropriate primers and probes for identifying embryo-specific genes from
plant tissues are generated from comparisons of the sequences provided herein. For a general
overview of PCR see PCR Protocols: A Guide to Methods and Applications (Innis, M.,
Gelfand, D., Sninsky, J. and White, T., eds.), Academic Press, San Diego (1990).

Appropriate primers for this purpose include, for instance: UP primer - 5' GGA ATT GAG
GAA GAA GCG AAG CGG A 3' and LP primer - 5' GGT GTA GAG ATA
GAA GAG TTT TCC TTA 3'. Alternatively, the following primer pairs can be used: 5'
ATG AGG AGG TGA GTG ATA GTA GG 3' and 5' GGG AGA GAT GGT GGT TGC TGC
TG 3', or 5' GAG ATA GAG AGG GAT GGT GGT TG 3' and 5' TGA GTT ATA GTG AGG
ATA ATG GTG 3'. The amplification conditions are typically as follows. Reaction
components: 10 mM Tris-HCl, pH 8.3, 50 mM potassium chloride, 1.5 mM magnesium
chloride, 0.001% gelatin, 200 microM dATP, 200 microM dCTP, 200 microM dGTP, 200
microM dTTP, 0.4 microM primers, and 100 units per ml Taq polymerase. Program: 96°C
for 3 min., 30 cycles of 96°C for 45 sec, 50°C for 60 sec, 72°C for 60 sec, followed by 72°C
for 5 min.
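The thermal cycling program above can be written down as a small data structure, which makes it easy to check the programmed hold time before a run. The following Python sketch is illustrative only: the names `Step`, `INITIAL_DENATURATION`, `CYCLE`, `FINAL_EXTENSION` and `total_hold_seconds` are assumptions, the extension step is read as 72°C, and ramp times between temperatures (which depend on the instrument) are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: int   # block temperature in degrees Celsius
    seconds: int  # hold time in seconds


# Program as stated in the text (extension step assumed to be 72 degrees C).
INITIAL_DENATURATION = Step(96, 3 * 60)
CYCLE = (
    Step(96, 45),  # denaturation
    Step(50, 60),  # primer annealing
    Step(72, 60),  # extension
)
N_CYCLES = 30
FINAL_EXTENSION = Step(72, 5 * 60)


def total_hold_seconds() -> int:
    """Sum of programmed hold times, excluding instrument ramp times."""
    per_cycle = sum(step.seconds for step in CYCLE)
    return (
        INITIAL_DENATURATION.seconds
        + N_CYCLES * per_cycle
        + FINAL_EXTENSION.seconds
    )


print(total_hold_seconds())       # 5430
print(total_hold_seconds() / 60)  # 90.5
```

The programmed holds therefore add up to about 90.5 minutes; the actual run time will be longer once ramping is included.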

Polynucleotides may also be synthesized by well-known techniques as 
described in the technical literature. See, e.g., Carruthers et al., Cold Spring Harbor Symp.
Quant. Biol. 47:411-418 (1982), and Adams et al., J. Am. Chem. Soc. 105:661 (1983).
Double stranded DNA fragments may then be obtained either by synthesizing the 
complementary strand and annealing the strands together under appropriate conditions, or by 
adding the complementary strand using DNA polymerase with an appropriate primer 
sequence. 

Use of nucleic acids of the invention to inhibit gene expression

The isolated sequences prepared as described herein can be used to prepare
expression cassettes useful in a number of techniques. For example, expression cassettes of
the invention can be used to suppress endogenous LEC1 gene expression. Inhibiting
expression can be useful, for instance, in weed control (by transferring an inhibitory sequence
to a weedy species and allowing it to be transmitted through sexual crosses) or to produce
fruit with small and non-viable seed.

A number of methods can be used to inhibit gene expression in plants. For 
instance, antisense technology can be conveniently used. To accomplish this, a nucleic acid 
segment from the desired gene is cloned and operably linked to a promoter such that the 
antisense strand of RNA will be transcribed. The expression cassette is then transformed into 
plants and the antisense strand of RNA is produced. In plant cells, it has been suggested that 
antisense RNA inhibits gene expression by preventing the accumulation of mRNA which 
encodes the enzyme of interest, see, e.g., Sheehy et al., Proc. Nat. Acad. Sci. USA, 
85:8805-8809 (1988), and Hiatt et al., U.S. Patent No. 4,801,340.
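The relationship between a sense-strand segment and the antisense RNA transcribed from an antisense-oriented construct is that of a reverse complement. The short Python sketch below illustrates this for a DNA segment written 5' to 3'. The function names `reverse_complement` and `antisense_rna` and the six-nucleotide example are illustrative assumptions, and only the unambiguous bases A, C, G and T are handled.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Reverse complement of a DNA sequence, read 5' to 3'."""
    dna = dna.upper()
    if set(dna) - set("ACGT"):
        raise ValueError("only unambiguous bases A, C, G, T are supported")
    return dna.translate(_COMPLEMENT)[::-1]


def antisense_rna(sense_dna: str) -> str:
    """RNA complementary to the mRNA of the given sense-strand segment (5' to 3')."""
    return reverse_complement(sense_dna).replace("T", "U")


sense_segment = "ATGGCC"              # sense strand; the mRNA would read AUGGCC
print(antisense_rna(sense_segment))   # GGCCAU
```

Cloning the segment in reverse orientation relative to the promoter yields a transcript of this kind, which can base-pair with the endogenous mRNA.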

The nucleic acid segment to be introduced generally will be substantially 
identical to at least a portion of the endogenous embryo-specific gene or genes to be 
repressed. The sequence, however, need not be perfectly identical to inhibit expression. The 
vectors of the present invention can be designed such that the inhibitory effect applies to other 
proteins within a family of genes exhibiting homology or substantial homology to the target 
gene. 
For antisense suppression, the introduced sequence also need not be full length
relative to either the primary transcription product or fully processed mRNA. Generally,
higher homology can be used to compensate for the use of a shorter sequence. Furthermore,
the introduced sequence need not have the same intron or exon pattern, and homology of
non-coding segments may be equally effective. Normally, a sequence of between about 30 or
40 nucleotides and about the full length should be used, though a sequence of at least
about 100 nucleotides is preferred, a sequence of at least about 200 nucleotides is more
preferred, and a sequence of at least about 500 nucleotides is especially preferred.

Catalytic RNA molecules or ribozymes can also be used to inhibit expression
of embryo-specific genes. It is possible to design ribozymes that specifically pair with
virtually any target RNA and cleave the phosphodiester backbone at a specific location,
thereby functionally inactivating the target RNA. In carrying out this cleavage, the ribozyme
is not itself altered, and is thus capable of recycling and cleaving other molecules, making it a
true enzyme. The inclusion of ribozyme sequences within antisense RNAs confers
RNA-cleaving activity upon them, thereby increasing the activity of the constructs.

A number of classes of ribozymes have been identified. One class of
ribozymes is derived from a number of small circular RNAs which are capable of
self-cleavage and replication in plants. The RNAs replicate either alone (viroid RNAs) or with a
helper virus (satellite RNAs). Examples include RNAs from avocado sunblotch viroid and the
satellite RNAs from tobacco ringspot virus, lucerne transient streak virus, velvet tobacco
mottle virus, solanum nodiflorum mottle virus and subterranean clover mottle virus. The
design and use of target RNA-specific ribozymes is described in Haseloff et al., Nature
334:585-591 (1988).

Another method of suppression is sense suppression. Introduction of
expression cassettes in which a nucleic acid is configured in the sense orientation with respect
to the promoter has been shown to be an effective means by which to block the transcription
of target genes. For an example of the use of this method to modulate expression of
endogenous genes see Napoli et al., The Plant Cell 2:279-289 (1990), and U.S. Patents Nos.
5,034,323, 5,231,020, and 5,283,184.

Generally, where inhibition of expression is desired, some transcription of the
introduced sequence occurs. The effect may occur where the introduced sequence contains
no coding sequence per se, but only intron or untranslated sequences homologous to
sequences present in the primary transcript of the endogenous sequence. The introduced
sequence generally will be substantially identical to the endogenous sequence intended to be
repressed. This minimal identity will typically be greater than about 65%, but a higher
identity might exert a more effective repression of expression of the endogenous sequences.
Substantially greater identity of more than about 80% is preferred, though about 95% to
absolute identity would be most preferred. As with antisense regulation, the effect should
apply to any other proteins within a similar family of genes exhibiting homology or substantial
homology.

For sense suppression, the introduced sequence in the expression cassette,
needing less than absolute identity, also need not be full length, relative to either the primary
transcription product or fully processed mRNA. This may be preferred to avoid concurrent
production of some plants which are overexpressers. A higher identity in a shorter than full
length sequence compensates for a longer, less identical sequence. Furthermore, the
introduced sequence need not have the same intron or exon pattern, and identity of
non-coding segments will be equally effective. Normally, a sequence of the size ranges noted
above for antisense regulation is used.

Another means of inhibiting LEC1 function in a plant is by creation of
dominant negatives. In this approach, non-functional, mutant LEC1 polypeptides, which
retain the ability to interact with wild-type subunits, are introduced into a plant. Residues
that can be changed to create a dominant negative can be identified from published
work examining interaction of different subunits of CBF homologs from different species
(see, e.g., Sinha et al. (1995). Proc. Natl. Acad. Sci. USA 92:1624-1628).

Use of nucleic acids of the invention to enhance gene expression

Isolated sequences prepared as described herein can also be used to prepare
expression cassettes which enhance or increase endogenous LEC1 gene expression. Where
overexpression of a gene is desired, the desired gene from a different species may be used to
decrease potential sense suppression effects. Enhanced expression of LEC1 polynucleotides
is useful, for example, to increase storage protein content in plant tissues. Such techniques
may be particularly useful for improving the nutritional value of plant tissues.

One of skill will recognize that the polypeptides encoded by the genes of the 
invention, like other proteins, have different domains which perform different functions.
Thus, the gene sequences need not be full length, so long as the desired functional domain of
the protein is expressed. As explained above, LEC1 polypeptides share sequences with CBF
proteins. The DNA binding activity, and, therefore, transcription activation function, of
LEC1 polypeptides is thought to be modulated by a short region of seven residues
(MPIANVI) at residues 34-40 of SEQ. ID. No. 2. Thus, the polypeptides of the invention will
often retain these sequences. Modified protein chains can also be readily designed utilizing
various recombinant DNA techniques well known to those skilled in the art and described,
for instance, in Sambrook et al., supra. Hydroxylamine can also be used to introduce single
base mutations into the coding region of the gene (Sikorski, et al. (1991). Meth. Enzymol.
194: 302-318). For example, the chains can vary from the naturally occurring sequence at
the primary structure level by amino acid substitutions, additions, deletions, and the like.
These modifications can be used in a number of combinations to produce the final
modified protein chain.

Desired modified LEC1 polypeptides can be identified using assays to
screen for the presence or absence of wild type LEC1 activity. Such assays can be based
on the ability of the LEC1 protein to functionally complement the hap3 mutation in yeast. As
noted above, it has been shown that homologs from different species functionally interact with
yeast subunits of the CBF (Sinha, et al. (1995). Proc. Natl. Acad. Sci. USA 92:1624-1628;
see also Becker, et al. (1991). Proc. Natl. Acad. Sci. USA 88:1968-1972). The reporter
for this screen can be any of a number of standard reporter genes such as the lacZ gene
encoding β-galactosidase that is fused with the regulatory DNA sequences and promoter of
the yeast CYC1 gene. This promoter is regulated by the yeast CBF.

A plasmid containing the LEC1 cDNA clone is mutagenized in vitro according
to techniques well known in the art. The cDNA inserts are excised from the plasmid and
inserted into the cloning site of a yeast expression vector such as pYES2 (Invitrogen). The
plasmid is introduced into hap3- yeast containing a lacZ reporter that is regulated by the yeast
CBF such as pLG265UP1-lacZ (Guarente, et al. (1984) Cell 36: 317-321). Transformants
are then selected and a filter assay is used to test colonies for β-galactosidase activity. After
confirming the results of activity assays, immunochemical tests using a LEC1 antibody are
performed on yeast lines that lack β-galactosidase activity to identify those that produce
stable LEC1 protein but lack activity. The mutant LEC1 genes are then cloned from the yeast
and their nucleotide sequence determined to identify the nature of the lesions.

In other embodiments, the promoters derived from the LEC1 genes of the
invention can be used to drive expression of heterologous genes in an embryo-specific or
seed-specific manner, such that desired gene products are present in the embryo, seed, or
fruit. Suitable structural genes that could be used for this purpose include genes encoding
proteins useful in increasing the nutritional value of seed or fruit. Examples include genes
encoding enzymes involved in the biosynthesis of antioxidants such as vitamin A, vitamin
C, vitamin E and melatonin. Other suitable genes include those encoding proteins involved in
modification of fatty acids, or in the biosynthesis of lipids, proteins, and carbohydrates.
Still other genes can be those encoding proteins involved in auxin and auxin analog 
biosynthesis for increasing fruit size, genes encoding pharmaceutically useful compounds, 
and genes encoding plant resistance products to combat fungal or other infections of the 
seed. 

Typically, desired promoters are identified by analyzing the 5' sequences of 
a genomic clone corresponding to the embryo-specific genes described here. Sequences 
characteristic of promoter sequences can be used to identify the promoter. Sequences 
controlling eukaryotic gene expression have been extensively studied. For instance, 
promoter sequence elements include the TATA box consensus sequence (TATAAT),
which is usually 20 to 30 base pairs upstream of the transcription start site. In most
instances the TATA box is required for accurate transcription initiation. In plants, further
upstream from the TATA box, at positions -80 to -100, there is typically a promoter
element with a series of adenines surrounding the trinucleotide G (or T) N G (J. Messing
et al., in Genetic Engineering in Plants, pp. 221-227 (Kosuge, Meredith and Hollaender,
eds. (1983))).
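
As a non-limiting illustration of how the sequence features just described might be located in a 5' flanking sequence, the following Python sketch scans for the TATAAT consensus within 20 to 30 base pairs upstream of a known transcription start site. The function name, the 0-based coordinate convention, and the choice to measure distance from the first base of the motif are assumptions made for this example only.

```python
import re

TATA_BOX = "TATAAT"  # consensus sequence recited in the text


def find_tata_boxes(seq: str, tss_index: int, min_up: int = 20, max_up: int = 30):
    """Return distances (in bp) from each TATAAT motif start to the start site.

    seq is the sense strand written 5' to 3'; tss_index is the 0-based index
    of the transcription start site within seq. Only motifs whose first base
    lies min_up to max_up bases upstream of the start site are reported.
    """
    seq = seq.upper()
    hits = []
    # A zero-width lookahead lets overlapping occurrences be found as well.
    pattern = "(?=" + re.escape(TATA_BOX) + ")"
    for match in re.finditer(pattern, seq):
        distance = tss_index - match.start()
        if min_up <= distance <= max_up:
            hits.append(distance)
    return hits


if __name__ == "__main__":
    # Hypothetical flanking sequence: motif starts at index 50, start site at 75.
    flank = "G" * 50 + "TATAAT" + "C" * 19 + "ATGC"
    print(find_tata_boxes(flank, tss_index=75))
```

A consensus match alone does not establish promoter function; candidate regions would still be characterized experimentally by methods such as those cited below.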

A number of methods are known to those of skill in the art for identifying and 
characterizing promoter regions in plant genomic DNA (see, e.g., Jordano, et al., Plant Cell
1:855-866 (1989); Bustos, et al., Plant Cell 1:839-854 (1989); Green, et al., EMBO J. 7:
4035-4044 (1988); Meier, et al., Plant Cell 3:309-316 (1991); and Zhang, et al., Plant
Physiology 110:1069-1079 (1996)).

Preparation of recombinant vectors 

To use isolated sequences in the above techniques, recombinant DNA 
vectors suitable for transformation of plant cells are prepared. Techniques for
transforming a wide variety of higher plant species are well known and described in the
technical and scientific literature. See, for example, Weising et al. Ann. Rev. Genet.
22:421-477 (1988). A DNA sequence coding for the desired polypeptide, for example a 
cDNA sequence encoding a full length protein, will preferably be combined with 
transcriptional and translational initiation regulatory sequences which will direct the 
transcription of the sequence from the gene in the intended tissues of the transformed 
plant. 

For example, for overexpression, a plant promoter fragment may be 
employed which will direct expression of the gene in all tissues of a regenerated plant. 
Such promoters are referred to herein as "constitutive" promoters and are active under 
most environmental conditions and states of development or cell differentiation. Examples 
of constitutive promoters include the cauliflower mosaic virus (CaMV) 35S transcription 
initiation region, the 1'- or 2'- promoter derived from T-DNA of Agrobacterium
tumefaciens, and other transcription initiation regions from various plant genes known to
those of skill. 

Alternatively, the plant promoter may direct expression of the 
polynucleotide of the invention in a specific tissue (tissue-specific promoters) or may be 
otherwise under more precise environmental control (inducible promoters). Examples of 
tissue-specific promoters under developmental control include promoters that initiate 
transcription only in certain tissues, such as fruit, seeds, or flowers. As noted above, the 
promoters from the LEC1 genes described here are particularly useful for directing gene
expression so that a desired gene product is located in embryos or seeds. Other suitable 
promoters include those from genes encoding storage proteins or the lipid body membrane 
protein, oleosin. Examples of environmental conditions that may affect transcription by 
inducible promoters include anaerobic conditions, elevated temperature, or the presence of 
light. 

If proper polypeptide expression is desired, a polyadenylation region at the 
3'-end of the coding region should be included. The polyadenylation region can be
derived from the natural gene, from a variety of other plant genes, or from T-DNA. 

The vector comprising the sequences (e.g., promoters or coding regions)
from genes of the invention will typically comprise a marker gene which confers a
selectable phenotype on plant cells. For example, the marker may encode biocide
resistance, particularly antibiotic resistance, such as resistance to kanamycin, G418,
bleomycin, hygromycin, or herbicide resistance, such as resistance to chlorsulfuron or
Basta.

Production of transgenic plants

DNA constructs of the invention may be introduced into the genome of the 
desired plant host by a variety of conventional techniques. For example, the DNA 
construct may be introduced directly into the genomic DNA of the plant cell using 
techniques such as electroporation and microinjection of plant cell protoplasts, or the DNA
constructs can be introduced directly to plant tissue using ballistic methods, such as DNA
particle bombardment. Alternatively, the DNA constructs may be combined with suitable
T-DNA flanking regions and introduced into a conventional Agrobacterium tumefaciens
host vector. The virulence functions of the Agrobacterium tumefaciens host will direct the
insertion of the construct and adjacent marker into the plant cell DNA when the cell is
infected by the bacteria.

Microinjection techniques are known in the art and well described in the 
scientific and patent literature. The introduction of DNA constructs using polyethylene 
glycol precipitation is described in Paszkowski et al. EMBO J. 3:2717-2722 (1984).
Electroporation techniques are described in Fromm et al. Proc. Natl. Acad. Sci. USA
82:5824 (1985). Ballistic transformation techniques are described in Klein et al. Nature
327:70-73 (1987).

Agrobacterium tumefaciens-mediated transformation techniques, including
disarming and use of binary vectors, are well described in the scientific literature. See,
for example, Horsch et al. Science 233:496-498 (1984), and Fraley et al. Proc. Natl. Acad.
Sci. USA 80:4803 (1983).

Transformed plant cells which are derived by any of the above 
transformation techniques can be cultured to regenerate a whole plant which possesses the 
transformed genotype and thus the desired phenotype such as seedlessness. Such 
regeneration techniques rely on manipulation of certain phytohormones in a tissue culture
growth medium, typically relying on a biocide and/or herbicide marker which has been
introduced together with the desired nucleotide sequences. Plant regeneration from
cultured protoplasts is described in Evans et al., Protoplasts Isolation and Culture,
Handbook of Plant Cell Culture, pp. 124-176, MacMillan Publishing Company, New
York, 1983; and Binding, Regeneration of Plants, Plant Protoplasts, pp. 21-73, CRC
Press, Boca Raton, 1985. Regeneration can also be obtained from plant callus, explants,
organs, or parts thereof. Such regeneration techniques are described generally in Klee et
al. Ann. Rev. of Plant Phys. 38:467-486 (1987).

The nucleic acids of the invention can be used to confer desired traits on 
essentially any plant. Thus, the invention has use over a broad range of plants, including 
species from the genera Asparagus, Atropa, Avena, Brassica, Citrus, Citrullus, Capsicum,
Cucumis, Cucurbita, Daucus, Fragaria, Glycine, Gossypium, Helianthus, Heterocallis,
Hordeum, Hyoscyamus, Lactuca, Linum, Lolium, Lycopersicon, Malus, Manihot,
Majorana, Medicago, Nicotiana, Oryza, Panicum, Pennisetum, Persea, Pisum, Pyrus,
Prunus, Raphanus, Secale, Senecio, Sinapis, Solanum, Sorghum, Trigonella, Triticum,
Vitis, Vigna, and Zea. The LEC1 genes of the invention are particularly useful in the
production of transgenic plants in the genus Brassica. Examples include broccoli,
cauliflower, brussel sprouts, canola, and the like.

One of skill will recognize that after the expression cassette is stably 
incorporated in transgenic plants and confirmed to be operable, it can be introduced into 
other plants by sexual crossing. Any of a number of standard breeding techniques can be 
used, depending upon the species to be crossed. 

Example 1 

This example describes the isolation and characterization of an exemplary
LEC1 gene.

Experimental Procedures

Plant Material

A lec1-2 mutant was identified from a population of Arabidopsis thaliana
ecotype Wassilewskija (Ws-0) lines mutagenized with T-DNA insertions as described before
(West et al., 1994). The abi3-3, fus3-3 and lec1-1 mutants were generously provided by
Peter McCourt, University of Toronto and David Meinke, Oklahoma State University. Wild
type plants and mutants were grown under constant light at 22°C.

Double mutants were constructed by intercrossing the mutant lines lec1-1,
lec1-2, abi3-3, fus3-3, and lec2. The genotype of the double mutants was verified through
backcrosses with each parental line. Double mutants were those that failed to complement
both parent lines. Homozygous single and double mutants were generated by germinating
intact seeds or dissected mature embryos before desiccation on basal media.

Isolation and Sequence Analysis of Genomic and cDNA Clones

Genomic libraries of Ws-0 wild type plants, lec1-1 and lec1-2 mutants were
made in GEM11 vector according to the instructions of the manufacturer (Promega). Two
silique-specific cDNA libraries (stages globular to heart and heart to young torpedo) were
made in ZAPII vector (Stratagene).

The genomic library of lec1-2 was screened using right and left T-DNA
specific probes according to standard techniques. About 12 clones that cosegregate with the
mutation were isolated and purified, and the entire DNAs were further labeled and used as
probes to screen a Southern blot containing wild type and lec1-1 genomic DNA. One clone
hybridized with plant DNA and was further analyzed. A 7.1 kb XhoI fragment containing
the left border and the plant sequence flanking the T-DNA was subcloned into
pBluescript-KS plasmid (Stratagene) to form ML7 and sequenced using a left border specific
primer (5' GCATAGATGCACTCGAAATCAGCC 3'). The T-DNA organization was
partially verified using Southern analysis with T-DNA left and right borders and pBR322
probes. The results suggested that the other end of the T-DNA is also composed of left
border. This was confirmed by generating a PCR fragment using a genomic plant DNA
primer (LP primer 5' GCT CTA GAG ATA CAA CAC TTT TCC TTA 3') and a T-DNA left
border specific primer (5' GCTTGGTAATAATTGTCATTAG 3') and sequencing.
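
For illustration only, basic properties of primers such as those recited above can be computed as follows. This Python sketch is not part of the described procedure. The helper names are hypothetical, and the melting temperature uses the rough Wallace rule (2 degrees per A or T, 4 degrees per G or C), which is only an approximation and is least reliable for longer oligonucleotides.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def clean(primer: str) -> str:
    """Remove spacing used for readability and normalize to upper case."""
    return "".join(primer.split()).upper()


def reverse_complement(primer: str) -> str:
    """Return the reverse complement, written 5' to 3'."""
    return clean(primer).translate(COMPLEMENT)[::-1]


def gc_percent(primer: str) -> float:
    """Percentage of G and C bases in the primer."""
    seq = clean(primer)
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(primer: str) -> int:
    """Rough melting temperature in degrees C by the Wallace rule."""
    seq = clean(primer)
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    primers = {
        "left border (sequencing)": "GCATAGATGCACTCGAAATCAGCC",
        "LP": "GCT CTA GAG ATA CAA CAC TTT TCC TTA",
        "left border (PCR)": "GCTTGGTAATAATTGTCATTAG",
    }
    for name, seq in primers.items():
        print(name, len(clean(seq)), round(gc_percent(seq), 1), wallace_tm(seq))
```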

The EcoRI insert of ML7 was used to screen a wild type genomic library.
Two overlapping clones were purified and a 7.4 kb EcoRI genomic fragment from the wild type
DNA region was subcloned into pBluescript-KS plasmid making WT74. This fragment was
sequenced (SEQ. ID. No. 4) and was used to screen the lec1-1 genomic library and wild type
silique-specific cDNA libraries. Eight clones from the lec1-1 genomic library were identified and
analyzed by restriction mapping.

From these clones the exact site of the deletion in lec1-1 was mapped and
sequenced by amplifying a Xbp PCR fragment using primers (H21 - 5' CTA AAA
ACA TCT ACG GTT CA 3'; H17 - 5' TTT GTG GTT GAG CGT TTG GC 3') flanking the
deletion region in lec1-1 genomic DNA. Clones were isolated from both cDNA libraries
and partially sequenced. The sequence of the cDNA clones and the wild type genomic clone
matched exactly, confirming that both derived from the same locus. All hybridizations were
performed under stringent conditions with 32P random prime probes (Stratagene).

Sequencing was done using the automated dideoxy chain termination method 
(Applied Biosystems, Foster City, CA). Database searches were performed at the National
Center for Biotechnology Information by using the BLAST network service. Alignment of
protein sequences was done using the PILEUP program (Genetics Computer Group, Madison,
WI).

DNA and RNA blot analysis

Genomic DNA was isolated from leaves by using the CTAB-containing buffer 
(Dellaporta, et al. (1983). Plant Mol. Biol. Reporter 1: 19-21). Two micrograms of DNA
were digested with different restriction endonucleases, electrophoretically separated in a 1%
agarose gel, and transferred to a nylon membrane (Hybond N; Amersham). 

Total RNA was prepared from siliques, two-day-old seedlings, stems, leaves,
buds and roots. Poly(A)+ RNA was purified from total RNA by oligo(dT) cellulose
chromatography, and two micrograms of each Poly(A)+ RNA sample were separated in a 1%
denaturing formaldehyde-agarose gel. Hybridizations were done under stringent conditions
unless specified otherwise. Radioactive probes were prepared as described above.

Complementation of lec1 mutants

A 3.4 kb BstYI fragment of genomic DNA (SEQ. ID. No. 3) containing
sequences from 1.992 kb upstream of the ORF to a region 579 bp downstream from the poly
A site was subcloned into the hygromycin resistant binary vector pBIB-Hyg. The LEC1
cDNA was placed under the control of the 35S promoter and the ocs polyadenylation signals
by inserting a PCR fragment spanning the entire coding region into the plasmid pART7. The
entire regulatory fragment was then removed by digestion with NotI and transferred into the
hygromycin resistant binary vector BJ49. The binary vectors were introduced into the
Agrobacterium strain GV3101, and constructions were checked by re-isolation of the
plasmids and restriction enzyme mapping, or by PCR. Transformation of homozygous lec1-1
and lec1-2 mutants was done using the in planta transformation procedure (Bechtold, et al.
(1993) Comptes Rendus de l'Academie des Sciences Serie III Sciences de la Vie, 316:
1194-1199). Dry seeds from lec1 mutants were selected for transformants by their ability to
germinate after desiccation on plates containing 5g/ml hygromycin. The transformed plants
were tested for the presence of the transgene by PCR and by screening the siliques for the
presence of viable seeds.

In Situ Hybridization

Experiments were performed as described previously by Dietrich et al. (1989)
Plant Cell 1: 73-80. Sections were hybridized with LEC1 antisense probe. As a negative
control, the LEC1 antisense probe was hybridized to seed sections of lec1 mutants. In
addition, a sense probe was prepared and reacted with the wild type seed sections.

Results 

Genetic Interaction Between Leafy Cotyledon-Type Mutants and abi3

In order to understand the genetic pathways which regulate late embryogenesis
we took advantage of three Arabidopsis mutants, lec2, fus3-3 and abi3-3, that cause
defects in late embryogenesis similar to those of lec1-1 or lec1-2. These mutants are desiccation
intolerant, sometimes viviparous and have activated shoot apical meristems. The lec2 and
fus3-3 mutants are sensitive to ABA and possess trichomes on their cotyledons and therefore
can be categorized as leafy cotyledon-type mutants (Meinke et al., 1994). The abi3-3
mutants belong to a different class of late embryo defective mutations that is insensitive to
ABA and does not have trichomes on the cotyledons.

The two classes of mutants were crossed to lec1-1 and lec1-2 mutants to
construct plants homozygous for both mutations. The lec1 and lec2 mutations interact
synergistically, resulting in a double mutant which is arrested in a stage similar to the late
heart stage; the double mutant embryo, however, is larger. The lec1 or lec2 and fus3-3
double mutants did not display any epistasis and the resulting embryo had an intermediate
phenotype. The lec1/abi3-3 double mutants and lec2/abi3-3 double mutants were ABA
insensitive and had a lec-like phenotype. There was no difference between double mutants that
consist of either lec1-1 or lec1-2.

No epistasis was seen between the double mutants, indicating that each of the
above genes, the LEC-type and ABI3 genes, operates in a different genetic pathway.

LEC1 Functions Early in Embryogenesis

The effect of lec1 is not limited to late embryogenesis; it also has a role in
early embryogenesis. The embryos of the lec1/lec2 double mutants were arrested in the early
stages of development, while the single mutants developed into mature embryos, suggesting
that these genes act early during development.

Further examination of the early stages of the single and double mutants
showed defects in the shape, size and cell division pattern of the mutants' suspensors. The
suspensor of the wild type embryo consists of a single file of six to eight cells, whereas the
suspensors of the mutants are often enlarged and undergo periclinal divisions. Leafy
cotyledon mutants exhibit suspensor anomalies at the globular or transition stage whereas
wild type and abi3 mutants do not show any abnormalities.

The number of anomalous suspensors increases as the embryos continue to 
develop. At the torpedo stage, the wild type suspensor cells undergo programmed cell death, 
but in the mutants secondary embryos often develop from the abnormal suspensors and, when 
rescued, give rise to twins. 

The Organization of the LEC1 Locus in Wild Type Plants and lec1 Mutants

Two mutant alleles of the LEC1 gene have been reported, lec1-1 and lec1-2
(Meinke, 1992; West et al., 1994). Both mutants were derived from a population of plants
mutagenized insertionally with T-DNA (Feldmann and Marks, 1987), although lec1-1 is not
tagged. The lec1-2 mutant contains multiple T-DNA insertions. A specific subset of T-DNA
fragments was found to be closely linked with the mutation. A genomic library of lec1-2
was screened using right and left T-DNA borders as probes. Genomic clones containing
T-DNA fragments that cosegregate with the mutation were isolated and tested on Southern
blots of both wild type and lec1-1 plants. Only one clone hybridized with Arabidopsis DNA
and also gave a polymorphic restriction fragment in lec1-1.

The lec1-1 polymorphism resulted from a small deletion, approximately 2 kb in
length. Using sequences from the plant fragment flanking the T-DNA, the genomic wild type
DNA clones and the lec1-1 genomic clones were isolated. An EcoRI fragment of 7.4 kb of
the genomic wild type DNA that corresponded to the polymorphic restriction fragment in
lec1-1 was further analyzed and sequenced. The exact site of the deletion in lec1-1 was
identified using a PCR fragment that was generated by primers within the expected borders
of the deleted fragment, and sequencing.

In the wild type genomic DNA that corresponded to the lec1-1 deletion, a 626
bp ORF was identified. Southern analysis of wild type DNA and the two mutants' DNA
probed with the short DNA fragment of the ORF revealed that both the wild type and lec1-2
DNA contain the ORF while the lec1-1 genomic DNA did not hybridize. The exact insertion
site of the T-DNA in the lec1-2 mutant was determined by PCR and sequencing and it was found
that the T-DNA was inserted 115 bp upstream of the ORF's translational initiation codon in
the 5' region of the gene.

At the site of the T-DNA insertion a small deletion of 21 plant nucleotides
and addition of 20 unknown nucleotides occurred. These results suggest that in lec1-2 the
T-DNA interferes with the regulation of the ORF while in lec1-1 the whole gene is deleted.
Thus, both lec1 alleles contain DNA disruptions at the same locus, confirming the identity of
the LEC1 locus.

The lec1 Mutants Can Be Complemented by Transformation

To prove that the 7.4 kb genomic wild type fragment indeed contained the
ORF of the LEC1 gene, we used a genomic fragment of 3395 bp (SEQ. ID. No. 3) within
that fragment to transform homozygous lec1-1 and lec1-2 plants. The clone consists of a
3395 bp BstYI restriction fragment containing the gene and the promoter region. The
translation start codon (ATG) of the polypeptide is at 1999 and the stop codon is at 2625
(TGA). There are no introns in the gene.
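
As a worked check of the coordinates just given, the following Python sketch computes the length of the open reading frame and of the encoded polypeptide. It assumes, for illustration only, that positions are 1-based, that 1999 is the A of the ATG, and that 2625 is the last base of the TGA stop codon; the text does not state these conventions, and the function name is hypothetical.

```python
def orf_summary(start: int, end: int, stop_included: bool = True):
    """Return (span in nucleotides, number of encoded amino acids).

    start and end are 1-based, inclusive coordinates of an intron-free ORF.
    If stop_included is True, the final codon is a stop codon and encodes
    no residue.
    """
    span = end - start + 1
    if span % 3 != 0:
        raise ValueError("span is not a whole number of codons")
    codons = span // 3
    residues = codons - 1 if stop_included else codons
    return span, residues


if __name__ == "__main__":
    span, residues = orf_summary(1999, 2625)
    print(span, residues)
```

Under these assumptions the span from ATG through TGA is 627 nucleotides (209 codons), which encodes 208 amino acids, in agreement with the 208 amino acid protein reported for the cDNA. The coordinate difference 2625 - 1999 is 626.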

The transformed plants were selected on hygromycin plates and were tested for
the wild type DNA fragment by PCR analysis. Both transgenic mutants were able to
produce viable progeny that were desiccation tolerant and did not possess trichomes on their
cotyledons. We concluded that the 3.4 kb fragment can complement the lec1 mutation and,
since there is only one ORF in the deleted 2 kb fragment in lec1-1, we suggest that this ORF
corresponds to the LEC1 gene.

The LEC1 Gene is a Member of a Gene Family

In order to isolate the LEC1 gene two cDNA libraries of young siliques were
screened using the 7.4 kb DNA fragment as a probe. Seventeen clones were isolated and after
further analysis and partial sequencing they were all found to be identical to the genomic
ORF. The cDNA contains a 626 bp ORF specifying a 208 amino acid protein (SEQ. ID. Nos. 1
and 2).

The LEC1 cDNA was used to hybridize a DNA gel blot containing Ws-0
genomic DNA digested with three different restriction enzymes. Using low stringency
hybridization we found that there is at least one more gene. This confirmed our finding of
two more Arabidopsis ESTs that show homology to the LEC1 gene.

The LEC1 Gene is Embryo Specific

The lec1 mutants are affected mostly during embryogenesis. Rescued mutants
can give rise to homozygous plants that have no obvious abnormalities other than the
presence of trichomes on their cotyledons and their production of defective progeny.
Therefore, we expected the LEC1 gene to have a role mainly during embryogenesis and not
during vegetative growth. To test this assumption Poly(A)+ RNA was isolated from siliques,
seedlings, roots, leaves, stems and buds of wild type plants and from siliques of lec1 plants.
Only one band was detected on northern blots using either the LEC1 gene as a probe or the
7.4 kb genomic DNA fragment, suggesting that there is only one gene in the genomic DNA
fragment which is active transcriptionally. The transcript was detected only in siliques
containing young and mature embryos and was not detected in seedlings, roots, leaves, stems
and buds, indicating that the LEC1 gene is indeed embryo specific. In addition, no RNA was
detected in siliques of both alleles of lec1 mutants, confirming that this ORF corresponds to
the LEC1 gene.

Expression Pattern of the LEC1 Gene

To study how the LEC1 gene specifies cotyledon identity, we analyzed its
expression by in situ hybridization. We specifically focused on young developing embryos
since the mutants' abnormal suspensor phenotype indicates that the LEC1 gene should be
active very early during development.

During embryogenesis, the LEC1 transcript was first detected in proglobular
embryos. The transcript was found in all cells of the proembryo and was also found in the
suspensor and the endosperm. However, from the globular stage on it accumulated more
in the outer layer of the embryo, namely the protoderm, and in the outer part of the ground
meristem, leaving the procambium without a signal. At the torpedo stage the signal was
stronger in the cotyledons and the root meristem, and was more limited to the protoderm
layer. At the bent cotyledon stage the signal was present throughout the embryo, and at the
last stage of development, when the embryo is mature and filling the whole seed, we could not
detect the LEC1 transcript. This might be due to sensitivity limitation and may imply that if
the LEC1 transcript is expressed at that stage it is not localized in the mature embryo, but
rather spread throughout the embryo.

LEC1 Gene Encodes a Homolog of CCAAT Binding Factor



WO 98/37184 PCT/US98/02998

Comparison of the deduced amino acid sequence of LEC1 to GenBank
reveals significant similarity to a subunit of a transcription factor, the CCAAT box binding
factor (CBF). CBFs are a highly conserved family of transcription factors that regulate gene
activity in eukaryotic organisms (Mantovani, et al. (1992). Nucl. Acids Res. 20: 1087-1091).
They are hetero-oligomeric proteins that consist of three to four non-homologous
subunits. LEC1 was found to have high similarity to the CBF-A subunit. This subunit has three
domains: A and C, which show no conservation between kingdoms, and a central domain, B,
which is highly conserved evolutionarily. Similarly, the LEC1 gene is composed of three
domains. The LEC1 B domain shares between 75%-85% similarity and 55%-63% identity
with different B domains that are found in organisms ranging from yeast to human. Within
this central domain, two highly conserved amino acid segments are present. Deletion and
mutagenesis analysis in the CBF-A yeast homolog hap3 protein demonstrated that a short
region of seven residues (42-48) (LPIANVA) is required for binding the CCAAT box, while
the subunit interaction domain lies in the region between residues 69-80 (MQECVSEFISFV)
(Xing et al., supra). LEC1 protein shares high homology to those regions.
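
For illustration only, the seven-residue DNA binding region of hap3 (LPIANVA, residues 42-48) can be compared position by position with the corresponding LEC1 region recited earlier in this specification (residues 34-40 of SEQ. ID. No. 2). The LEC1 motif is read here as MPIANVI, which is an assumption where the scanned text is unclear. The function name is hypothetical.

```python
def compare_motifs(motif_a: str, start_a: int, motif_b: str, start_b: int):
    """Compare two equal-length motifs residue by residue.

    Returns (number of identical positions, list of substitutions), where each
    substitution is written as residue and position in A, then residue and
    position in B.
    """
    if len(motif_a) != len(motif_b):
        raise ValueError("motifs must have equal length")
    identical = 0
    substitutions = []
    for offset, (a, b) in enumerate(zip(motif_a, motif_b)):
        if a == b:
            identical += 1
        else:
            substitutions.append(f"{a}{start_a + offset} -> {b}{start_b + offset}")
    return identical, substitutions


if __name__ == "__main__":
    # hap3 residues 42-48 versus LEC1 residues 34-40 (LEC1 reading assumed).
    identical, subs = compare_motifs("LPIANVA", 42, "MPIANVI", 34)
    print(identical, subs)
```

Under that reading, five of the seven positions are identical, and the two differences fall at the first and last residues of the region.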

DISCUSSION 

The lec1 mutant belongs to the leafy cotyledon class that interferes mainly
with the embryo program and therefore is thought to play a central regulatory role during
embryo development. It was shown before that LEC1 gene activity is required to suppress
germination during the maturation stage. Therefore, we analyzed the genetic interaction of
homozygous double mutants of the different members of the leafy cotyledon class and the
abi3 mutant that has an important role during embryo maturation. All five
combinations of the double mutants showed either an intermediate phenotype or an additive
effect. No epistatic relationship among the four genes was found. These findings suggest
that the different genes act in parallel genetic pathways. Of special interest was the double
mutant lec1/lec2 that was arrested morphologically at the heart stage, but continued to grow
in that shape. This double mutant phenotype indicates that both genes LEC1 and LEC2 are
essential for early morphogenesis and their products may interact directly or indirectly in the
young developing embryo.

The Role of LEC1 in Embryogenesis

One of the proteins that mediate CCAAT box function is a heteromeric
protein called CBF (also called NFY or CP1). CBF is a transcription activator that regulates
constitutively expressed genes, but also participates in differential activation of developmental
genes (Wingender, E. (1993). Gene Regulation in Eukaryotes (New York: VCH Publishers)).
In mammalian cells, three subunits have been identified, CBF-A, CBF-B and CBF-C, all of
which are required for DNA binding. In yeast, the CBF homolog HAP activates the CYC1
and other genes involved in the mitochondrial electron transport (Johnson, et al., Proteins.
Annu. Rev. Biochem. 58, 799-840 (1989)). HAP consists of four subunits: hap2, hap3, hap4
and hap5. Only hap2, 3 and 5 are required for DNA binding. CBF-A, B and C show high
similarity to the yeast hap3, 2 and 5, respectively. It was also reported that mammalian
CBF-A and B can be functionally interchangeable with the corresponding yeast subunits
(Sinha et al., supra).

The LEC1 gene encodes a protein that shows more than 75% similarity to the
conserved region of CBF-A. CCAAT motifs are not common in plant promoters and their
role in transcription regulation is not clear. However, maize and Brassica homologs have
been identified. A search of the Arabidopsis GenBank entries revealed several ESTs that show
high similarity to CBF-A, B and C. Accession numbers of CBF-A (HAP3) homologs: H37368,
H76589; CBF-B (HAP2) homologs: T20769; CBF-C (HAP5) homologs: T43909, T44300.
These findings and the pleiotropic effects of LEC1 suggest that LEC1 is a member of a
heteromeric complex that functions as a transcription factor.

The model suggests that LEC1 acts as a transcription activator of several sets of
genes, which keep the embryonic program on and repress the germination process.
Defective LEC1 expression partially shuts down the embryonic program, and as a result the
cotyledons lose their embryonic characteristics and the germination program becomes active
in the embryo.

Example 2 

This example demonstrates that LEC1 is sufficient to induce embryonic
pathways in transgenic plants.

The phenotype of lec1 mutants and the gene's expression pattern indicated
that LEC1 functions specifically during embryogenesis. A LEC1 cDNA clone under the
control of the cauliflower mosaic virus 35S promoter was transferred into lec1-1 mutant
plants in planta using standard methods as described above.

Viable dry seeds were obtained from lec1-1 mutants transformed with the
35S/LEC1 construct. However, the transformation efficiency was only approximately 0.6%
of that obtained normally. In several experiments, half the seeds that germinated (12/23)
produced seedlings with an abnormal morphology. Unlike wild type seedlings, these
35S/LEC1 seedlings possessed cotyledons that remained fleshy and failed to expand.
Roots often did not extend or extended abnormally and sometimes greened. These seedlings
occasionally produced a single pair of organs on the shoot apex at the position normally
occupied by leaves. Unlike wild type leaves, these organs did not expand and did not possess
trichomes. Morphologically, these leaf-like structures more closely resembled embryonic
cotyledons than leaves.

The other 35S/LEC1 seeds that remained viable after drying produced plants
that grew vegetatively. The majority of these plants (7) flowered and produced 100% lec1
mutant seeds. Amplification experiments confirmed that the seedlings contained the
transgene, suggesting that the 35S/LEC1 gene was inactive in these T2 seeds. No vegetative
abnormalities were observed in these plants, with the exception that a few displayed defects in
apical dominance. A few plants (2) were male sterile and did not produce progeny. One
plant that produced progeny segregated 25% mutant Lec1- seeds that, when germinated
before desiccation and grown to maturity, gave rise to 100% mutant seed, as expected for a
single transgene locus. The other 75% of seeds contained embryos with either a wild type
phenotype or a phenotype intermediate between lec1 mutants and wild type. Only 25% of the
dry seed from this plant germinated, and all seedlings resembled the embryo-like seedlings
described above. Some seedlings continued to grow and displayed a striking phenotype.
These 35S/LEC1 plants developed two types of structures on leaves. One type resembled
embryonic cotyledons while the other looked like intact torpedo stage embryos. Thus,
ectopic expression of LEC1 induces the morphogenesis phase of embryo development in
vegetative cells.
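
The 25% figure reported for the segregating plant follows from simple Mendelian arithmetic. As a hedged illustration only, the sketch below assumes a parent hemizygous for the transgene at a single locus in the lec1-1 background and equal transmission through both gametes. The names selfed_progeny, "T" (transgene present) and "-" (transgene absent) are hypothetical labels introduced for the example.

```python
from fractions import Fraction
from itertools import product


def selfed_progeny(parent=("T", "-")):
    """Genotype frequencies among progeny of a selfed plant.

    The parent is given as its two alleles at one locus; every
    pairing of one maternal and one paternal gamete is equally likely.
    """
    freqs = {}
    for g1, g2 in product(parent, repeat=2):
        genotype = "".join(sorted((g1, g2), reverse=True))
        freqs[genotype] = freqs.get(genotype, Fraction(0)) + Fraction(1, 4)
    return freqs


freqs = selfed_progeny()
for genotype, freq in freqs.items():
    print(genotype, freq)
```

Under these assumptions one quarter of the progeny ("--") carry no transgene copy and would be expected to show the mutant phenotype and breed true, which is consistent with the single-locus interpretation given above.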

Because many 35S/LEC1 seedlings exhibited embryonic characteristics, the
seedlings were analyzed for expression of genes specifically active in embryos. Cruciferin A
storage protein mRNA accumulated throughout the 35S/LEC1 seedlings, including the
leaf-like structures. Proteins with sizes characteristic of the 12S storage protein cruciferin
accumulated in these transgenic seedlings. Thus, 35S/LEC1 seedlings displaying an
embryo-like phenotype accumulated embryo-specific mRNAs and proteins. LEC1 mRNA
accumulated to a high level in these 35S/LEC1 seedlings in a pattern similar to early stage
embryos but not in wild type seedlings. LEC1 is therefore sufficient to alter the fate of
vegetative cells by inducing embryonic programs of development.

The ability of LEC1 to induce embryonic programs of development in
vegetative cells establishes the gene as a central regulator of embryogenesis. LEC1 is
sufficient to induce the seed maturation pathway, as indicated by the induction of storage
protein genes in the 35S/LEC1 seedlings. The presence of ectopic embryos on leaf surfaces
and of cotyledons at the position of leaves also shows that LEC1 can activate the embryo
morphogenesis pathway. Thus, LEC1 regulates both early and late embryonic processes.

The above examples are provided to illustrate the invention but not to limit its
scope. Other variants of the invention will be readily apparent to one of ordinary skill in the
art and are encompassed by the appended claims. All publications, patents, and patent
applications cited herein are hereby incorporated by reference.

SEQUENCE LISTING



(1) GENERAL INFORMATION: 

(i) APPLICANT: The Regents of the University of California 
(ii) TITLE OF INVENTION: Leafy Cotyledon1 Genes and Their Uses
(iii) NUMBER OF SEQUENCES: 18 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Fulbright & Jaworski, LLP

(B) STREET: 865 S. Figueroa Street, 29th Floor

(C) CITY: Los Angeles 

(D) STATE: California 

(E) COUNTRY: USA 

(F) ZIP: 90017-2571 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30

(vi) CURRENT APPLICATION DATA:

(A) APPLICATION NUMBER: 

(B) FILING DATE: 

(C) CLASSIFICATION: 

(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: Berliner, Robert 

(B) REGISTRATION NUMBER: 20,121 

(C) REFERENCE/DOCKET NUMBER: 5555-470 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: (213) 892-9200 

(B) TELEFAX: (213) 680-4518 



(2) INFORMATION FOR SEQ ID NO : 1 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 627 base pairs 

( B ) TYPE : nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..627

(D) OTHER INFORMATION: /product= "LEC1"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

ATG ACC AGC TCA GTC ATA GTA GCC GGC GCC GGT GAC AAG AAC AAT GGT 48
Met Thr Ser Ser Val Ile Val Ala Gly Ala Gly Asp Lys Asn Asn Gly
1 5 10 15

ATC GTG GTC CAG CAG CAA CCA CCA TGT GTG GCT CGT GAG CAA GAC CAA 96
Ile Val Val Gln Gln Gln Pro Pro Cys Val Ala Arg Glu Gln Asp Gln
20 25 30

TAC ATG CCA ATC GCA AAC GTC ATA AGA ATC ATG CGT AAA ACC TTA CCG 144
Tyr Met Pro Ile Ala Asn Val Ile Arg Ile Met Arg Lys Thr Leu Pro
35 40 45

TCT CAC GCC AAA ATC TCT GAC GAC GCC AAA GAA ACG ATT CAA GAA TGT 192
Ser His Ala Lys Ile Ser Asp Asp Ala Lys Glu Thr Ile Gln Glu Cys
50 55 60

GTC TCC GAG TAC ATC AGC TTC GTG ACC GGT GAA GCC AAC GAG CGT TGC 240
Val Ser Glu Tyr Ile Ser Phe Val Thr Gly Glu Ala Asn Glu Arg Cys
65 70 75 80

CAA CGT GAG CAA CGT AAG ACC ATA ACT GCT GAA GAT ATC CTT TGG GCT 288
Gln Arg Glu Gln Arg Lys Thr Ile Thr Ala Glu Asp Ile Leu Trp Ala
85 90 95

ATG AGC AAG CTT GGG TTC GAT AAC TAC GTG GAC CCC CTC ACC GTG TTC 336
Met Ser Lys Leu Gly Phe Asp Asn Tyr Val Asp Pro Leu Thr Val Phe
100 105 110

ATT AAC CGG TAC CGT GAG ATA GAG ACC GAT CGT GGT TCT GCA CTT AGA 384
Ile Asn Arg Tyr Arg Glu Ile Glu Thr Asp Arg Gly Ser Ala Leu Arg
115 120 125

GGT GAG CCA CCG TCG TTG AGA CAA ACC TAT GGA GGA AAT GGT ATT GGG 432
Gly Glu Pro Pro Ser Leu Arg Gln Thr Tyr Gly Gly Asn Gly Ile Gly
130 135 140

TTT CAC GGC CCA TCT CAT GGC CTA CCT CCT CCG GGT CCT TAT GGT TAT 480
Phe His Gly Pro Ser His Gly Leu Pro Pro Pro Gly Pro Tyr Gly Tyr
145 150 155 160

GGT ATG TTG GAC CAA TCC ATG GTT ATG GGA GGT GGT CGG TAC TAC CAA 528
Gly Met Leu Asp Gln Ser Met Val Met Gly Gly Gly Arg Tyr Tyr Gln
165 170 175

AAC GGG TCG TCG GGT CAA GAT GAA TCC AGT GTT GGT GGT GGC TCT TCG 576
Asn Gly Ser Ser Gly Gln Asp Glu Ser Ser Val Gly Gly Gly Ser Ser
180 185 190

TCT TCC ATT AAC GGA ATG CCG GCT TTT GAC CAT TAT GGT CAG TAT AAG 624
Ser Ser Ile Asn Gly Met Pro Ala Phe Asp His Tyr Gly Gln Tyr Lys
195 200 205

TGA 627

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 208 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Met Thr Ser Ser Val Ile Val Ala Gly Ala Gly Asp Lys Asn Asn Gly
1 5 10 15

Ile Val Val Gln Gln Gln Pro Pro Cys Val Ala Arg Glu Gln Asp Gln
20 25 30

Tyr Met Pro Ile Ala Asn Val Ile Arg Ile Met Arg Lys Thr Leu Pro
35 40 45

Ser His Ala Lys Ile Ser Asp Asp Ala Lys Glu Thr Ile Gln Glu Cys
50 55 60

Val Ser Glu Tyr Ile Ser Phe Val Thr Gly Glu Ala Asn Glu Arg Cys
65 70 75 80

Gln Arg Glu Gln Arg Lys Thr Ile Thr Ala Glu Asp Ile Leu Trp Ala
85 90 95

Met Ser Lys Leu Gly Phe Asp Asn Tyr Val Asp Pro Leu Thr Val Phe
100 105 110

Ile Asn Arg Tyr Arg Glu Ile Glu Thr Asp Arg Gly Ser Ala Leu Arg
115 120 125

Gly Glu Pro Pro Ser Leu Arg Gln Thr Tyr Gly Gly Asn Gly Ile Gly
130 135 140

Phe His Gly Pro Ser His Gly Leu Pro Pro Pro Gly Pro Tyr Gly Tyr
145 150 155 160

Gly Met Leu Asp Gln Ser Met Val Met Gly Gly Gly Arg Tyr Tyr Gln
165 170 175

Asn Gly Ser Ser Gly Gln Asp Glu Ser Ser Val Gly Gly Gly Ser Ser
180 185 190

Ser Ser Ile Asn Gly Met Pro Ala Phe Asp His Tyr Gly Gln Tyr Lys
195 200 205
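
The correspondence between the coding sequence (SEQ ID NO: 1, CDS 1..627) and the polypeptide (SEQ ID NO: 2, 208 amino acids) can be spot-checked by translating codon rows with the standard genetic code. The following is an illustrative sketch only and not part of the listing. The helper name translate and the choice of the first and last codon rows are assumptions made for the example.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds: str) -> str:
    """Translate a DNA coding sequence, stopping at the first stop codon."""
    cds = cds.replace(" ", "").upper()
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":  # stop codon
            break
        protein.append(aa)
    return "".join(protein)


# First and last codon rows of SEQ ID NO: 1 (the last row includes the TGA stop).
FIRST_ROW = "ATG ACC AGC TCA GTC ATA GTA GCC GGC GCC GGT GAC AAG AAC AAT GGT"
LAST_ROW = "TCT TCC ATT AAC GGA ATG CCG GCT TTT GAC CAT TAT GGT CAG TAT AAG TGA"

print(translate(FIRST_ROW))  # residues 1-16 in one-letter code
print(translate(LAST_ROW))   # residues 193-208 in one-letter code
print(627 // 3 - 1)          # codons minus the stop codon
```

The 627 nucleotides correspond to 209 codons, that is, 208 amino acids followed by one stop codon, which agrees with the length given for SEQ ID NO: 2.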



(2) INFORMATION FOR SEQ ID NO : 3 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3395 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:
AGATCCAAAA CAGGTCATGG ACTGGGCCGT AAACTCTATC CAAAATTCTT CATGTTTTTC 
CATCTTTCAA AAATCTTTAT CCACCATTCC ATTACTAGGG TGTTGGTTTT ATTTTATTTG 
TTGATTAATT ATGTATTAGA AAATGTAAAG CAATATTCAA TTGTAACATG CATCATCTAA 
CACCAATATC TTGTACTAAC CTTTTGTAAT TTTCCTATAA ACATTTTAAA AGGCTAATTT 



AAATAAAAAT TACAATAAAC GTGATAACTC ACTTTCGTAA CGCATATTTA TTCAAATATA 
CCAAAATTTA CCATTTTAAG TAAGAGAATC TTTTTAAAAT TAATTTTCAA TTTCATTAAT 
20 TAAGAAACAA AGAATTTACT GAAACCTATA TTTTATTAAA TTTTAATAAA ATATATGACT 

AAAATAACGT CACGTGAATC TTTCTCAGCC GTTCGATAAT CGAATACTTT ATTGACTAAG 
TATTTATTTA GAAAATTTTA AACAACACTT AATTTCTAGA AACAAAGAGA GCCTCATATG 
TATAAAAATC TTCTTCTTAT CTTTCTTTCT TTCTTAATAG TCTTTATTTT TACTTAATTA 
CTTTGGTAAT TTGTGAAAAA CACAACCAAT GAGAGAAGAG CAGTTTGACT GGCCACATAG 
CCAATGAGAC AAGCCAATGG GAAAGAGATA TAGAGACCTC GTAAGAACCG CTCCTTTGCC 
ATTTGTATCA TCTCTCTATA AAACCACTCA ACCATCAACC ^TCTTTGCA TGCAACAAAT 
CACTCAAATA ATTATTTTAT AAAGAACAAA AAAAAAAAGA CGGCAGAGAA ACAATGGAAC 
GTGGAGCTCC CTTCTCTCAC TATCAGCTAC CCAAATCCAT CTCTGGTAAT CTAAGTGGCT 
;,TTTGTATAC AGTATATACT TGCCTCCATG TATATTTATA TTCTCGTGAA AAATTGGAGA 
CATGCTTTAT GAATTTTATG AGACTTTGCA ACAACGAACG AGATGCTTTC TCTCTAGAAA 
TTTAAATTTA GATTTGTGAA GGTTTTGGGA ATGGCCCGGA GAAGACGATT TTATATATAC 
;.TGCATGCAA GAGTTTGATA TGTATATTGT TTCATCATGG CTGAGTCAAA GTTTTATCCA 
AATATTTCCA TGGTGTGGTA TTAGTTAAAC AAATCTCTCG TATGTGTCAT TGAATATACC 
CGTGCATGTA CCAGGAATGT TTTTGATTCT AAAAACGTTT TTTTCTTTGT TGTAACGGTT 
GAGTTTTTTT CTTCGTTTCA AAACGAGATT CTCGTTTGTC TCTTCCCTTG TCTAAAAACA 
TCTACGGTTC ATGTGATTCA AAAACACTAA AAAAATATAA ACTCATTTTT TTTTAATACT 
TAACATTTAA ACTATATATA TATATATATA TATATATATC TTATACTAGT CCCAAGTTTT 



40 



50 



;,GTGTGAGGT TTTTTTATTC AAAATCTATC AGTACATTTT TTGGAAAAGA ACTAAGTGAA 
;,TTTTCTCCA AATTTTCCTT TTACTATTGA TTTTTTAATT ACTGGATGTC ATTAACTTTA 



60 
120 
180 
240 



720 
780 
840 



1020 
1080 
1140 



1320 
1380 
1440 



1560

ATCTTTTGAT TCTTTCAACG TTTACCATTG GGAACCTTCA CATGAAATAA ATGTCTACTT 1620
TATTGAGTCA TACCTTCGTC AACATAAATT AATTGATGTT CTTCTCCAAA TTTTGAGTTT 1680
TTGGTTTTTC TAATAATCTT AACGAAAGCT TTTTGGTATA CATGTAAAAC GTAACGGCAA 1740
GAATCTGAAC AGTCTACTCA ACGGGGTCCA TAAGTCTAGA ATGTAGACCC CACAAACTTA 1800
CTCTTATCTT ATTGGTCCGT AACTAAGAAC GTGTCCCTCT GATTCTCTTG TTTTCTTCTA 1860
ATTAATTCGT ATCCTACAAA TTTAATTATC ATTTCTACTT CAACTAATCT TTTTTTATTT 1920
CCTAAAGATT TCAATTTCTC TCTGTATTTT CTATGAACAG AATTGAACTT GGACCAGCAC 1980
AGCAACAACC CAACCCCAAT GACCAGCTCA GTCATAGTAG CCGGCGCCGG TGACAAGAAC 2040
AATGGTATCG TGGTCCAGCA GCAACCACCA TGTGTGGCTC GTGAGCAAGA CCAATACATG 2100
CCAATCGCAA ACGTCATAAG AATCATGCGT AAAACCTTAC CGTCTCACGC CAAAATCTCT 2160
GACGACGCCA AAGAAACGAT TCAAGAATGT GTCTCCGAGT ACATCAGCTT CGTGACCGGT 2220
GAAGCCAACG AGCGTTGCCA ACGTGAGCAA CGTAAGACCA TAACTGCTGA AGATATCCTT 2280
TGGGCTATGA GCAAGCTTGG GTTCGATAAC TACGTGGACC CCCTCACCGT GTTCATTAAC 2340
CGGTACCGTG AGATAGAGAC CGATCGTGGT TCTGCACTTA GAGGTGAGCC ACCGTCGTTG 2400
AGACAAACCT ATGGAGGAAA TGGTATTGGG TTTCACGGCC CATCTCATGG CCTACCTCCT 2460
CCGGGTCCTT ATGGTTATGG TATGTTGGAC CAATCCATGG TTATGGGAGG TGGTCGGTAC 2520
TACCAAAACG GGTCGTCGGG TCAAGATGAA TCCAGTGTTG GTGGTGGCTC TTCGTCTTCC 2580
ATTAACGGAA TGCCGGCTTT TGACCATTAT GGTCAGTATA AGTGAAGAAG GAGTTATTCT 2640
TCATTTTTAT ATCTATTCAA AACATGTGTT TCGATAGATA TTTTATTTTT ATGTCTTATC 2700
AATAACATTT CTATATAATG TTGCTTCTTT AAGGAAAAGT GTTGTATGTC AATACTTTAT 2760
GAGAAACTGA TTTATATATG CAAATGATTG AATCCAAACT GTTTTGTGGA TTAAACTCTA 2820
TGCAACATTA TATATTTACA TGATCTAAAG GTTTTGTAAT TCAAAAGCTG TCATAGTTAG 2880
AAGATAACTA AACATTGTAG TAACCAAGTT TAATTTACTT TTTTGAGTTT ACATAACTAA 2940
CCAAGCCAAA AGGTTATAAA ATCTAAATTC GTTGAGTTGT CAAACTTCTG AAGATTGCTA 3000
TCCTCTTTGA GTTGCTTTCT TTTGGGTGCT TGAGTTTCAT TAGGCTGAGC TGACTCGTTG 3060
CTCTCTAGTC TTTCATCTCT GTCTTTTCCA AGGATTCATA ACGTTGGTCG CTCTCTGTTT 3120
CTGCCTACAC TTCTTCAAGG GATCATTACT GAGGCTAAGA GTTAAAGACC TGAACCATGG 3180
TTTTCTGTAA CTGGTTCAAG TTCATTCTCC GGTTATTGTG TGGTTATCTT TCGGTTAGAT 3240
TGAAACCCAT ATGTTTGCTC TGTTTCTTCT AGTTCCAAGT TTAATTTCCG GTTATTGTTT 3300
GGCTTTTTAA AAGTTTTTAA GGTCTATTCT ATGTAAAGAC TATTCTACGT ACGTACATTT 3360
ATCGCAAAAT TGAAAGATTA TAAAAAAAAT TGAAA 3395

(2) INFORMATION FOR SEQ ID NO : 4 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7560 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 






XAAATTTTCT TTTTGTAACT TTTCTTTAGA TTTATTTACg' ANAAGAGAAA TATAAACGTC 
.TGCTAATAA AAAATGCATT ATTTTCTACC ATCTAGCTAG AATATTGATC AAGTCTTCAC 
GTTTTTTGTT TATCTCTTCT CTCATAGGCA TGTCCACAAA AGGGTAAGTT TTACTGGTTC 
^;^TATTGC ATGAGTACTA CTAAGCTCGT ATAGTTTGAT CTTACTATCA TTGCGATGAG 



60 



180 
240 
300 
360 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:
AATTNACCCT CACTAAAGGG AACAAAAGCT GGGTACCGGG CCCCCCCTCG AGGTCGACGG 
TATCGATAAG CTTGATATCG AATTCGTGGC CATTAGACCC ATAACTATAT GACGATGTTA ^20 
^^GAGAAAAT AAATCATAAA TAAAATAAGA GTCCTTATCA ATAAACCTAA TTGGCTAATT 
TCAACCTCAA AGAGTAGTAG GAACAGGTAA GGTGAAGCCA AACAGCTCCT TTTACAGTTG 
GACCACTAGA GCTGATCTGG CATACAAAGT ATGCTTATTG GGCTGTCACG GCCCATCCGC 
AAAATGTCGT TGGTTACGAA GCATCCACGA CATAGACGGT GCCACATGTT AGAAAAGTGT 
TTCGGCGATC AAGATTGTGT CCACATCATT AGACGTCTGA ACTGTCCACG TGTCTATCAA 
;,GCTGGCGTC AAACATTACG TTTTCGTCGT TTGCGCCTCC TAGTTCACAC GTGCAACGAA 
CGCGTGCGAC GTATCAAAAT TGTTAATTTT AGCCATGTAT AAAGAATATC TACAAAATTA 
ACCTCAGGAA TATTTTTGTT TTTTCAATTG AGGCCATAAT ATACNTNCCG ATNGAAAAAT 
riTNCANCAT ATCNCTAATA TCAAAAAATT ATGATGTTAG TAAACGTAAA AAATTTACAC 
AAAATAANTT TCACAAAACT TANNGGGGAA ATTGGAACAA ANAAAAGACT GGTGAGTGAT 
^^GCGATGAT GGCCGGTGAA TCAGGTAGCC GTCCTACAAC GTGGTTGATT TTGAGCAAAC 
TCCTATCTAC TCTTCACACT ATTGGAAATC CCAAAATGTC GTCACACCAT AATAATGTGA 
ATTTTGTTAT GGAATTTGAG GGAAACAGTA GATATATGTT TCAACCAGTG AAAGTTACCC 
TCCTTTGGAC ATATCTACGA NAGTAGAAAG TAGAAACATT CACTAAACGT GACAACTTTA 



1080 
1140 
1200 
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GGTTGTTAGT TTGGAAGAAA TAAGGATTTA TGCAAATGGT AATCATTATG TCTGCTATTT 12 60 

AAGAAGTAAA TTATGATGCT TGTTGCGTGA ACATATT7VAA TTTGCGAAAA ATAAGCAAGG 13 20 
ATACACGAGA GAAGCTCAGA TATTCACGTA ACGATGTTTC ATCTCTTCTC ATTGAGGAAA * 13 80 

CATATGGCCA TGATATAGCT AATAAGCCTA CGGGATTGTC NTTTCAACGC CGAATCTACC 144 0 

AAACTGTTCC ATCTCTTATT ATATATAGTT TGGTTATTTA AGTAATTAGA TGCATCATAA 150 0 

TCTTTTTTTC TGCCAGTTGT AATGCAGATA AAAATATATT GGTTGTTCTA AGGATTGTTC 1560 

AAACGTGCAT GTGTACAAGT TATTATTTAT ATACTTTCAT CTACATGCGA TGCGTTATTT 162 0 

15 ATAATGATAA AACTAAGATT TTTAGTTAAA TTTAATAAAG AGCTTACGAG CTACAATTAA 168 0 

TTAGAAATGG TTGCTCAGAA ATCAGAATAC TATATATGAA AAAAGAAGTT GGTATACTTG 174 0 

AAAAAAGAAA AAACTACTTG AAAAGATGGT AAAAGATATA GAACGAGTAT ATATCTTACT 1800 

CAAGCACGAT AGAAGTTTGT ATCAAAACAT TGCGTTCCAA ACCAATGTTT GAAGATGGTC 186 0 

AAAGGTGCTA CTCATGATGT GGTGCGAAGA AGCTTACGAA AAATTCTGCA ATGAGAGATA 1920 

25 ACTTTATGGG CTGCTTGTTC AATATATTGA AAATCATGGT AGACAACACC AAACTCTCCT 1980 

TTACCAGAAG TCATATTTCC TTAACCTCAG AATAAGTAAA TCTTCTAGTT TATTATTTGA 2 04 0 



AAGTTGAGCG TATAATTGCA ATGAAACTTT TACCAATTCA CCGCCTCCTA ACTGAGTTGT 2100
TGTATTATCC TATCTCTTTA GCTATCCTTT CCTTGCTCTT GCTCCACCTG CATGTGGCCT 2160
CTTTATTTAT AATCTCTCTA GATTCTGCTA AAGATGTNTG TTCAAAATGG TTTATCTTTA 2220
AGGGAAGCAA AGTGAATGGA AACATTTAAA GAAAAAAAAA ACTTTTAGCA GAGTTCCATG 2280
AGATTTCATA CTGATGATAA CTAAAATAAT CTTATATGCG TAAGATTATT TTAGTTCTAA 2340
ACTTCATTTT GAAATGAGAG GTCATTGGCC AGGAAAGATT CAATATTGGT TCTTTGTTAA 2400
TTCTCGTTGG TTTGTTTTTA GTATGGGCTA GATCCAAAAC AGGTCATGGA CTGGGCCGTA 2460
AACTCTATCC AAAATTCTTC ATGTTTTTCC ATCTTTCAAA AATCTTTATC CACCATTCCA 2520
TTACTAGGGT GTTGGTTTTA TTTTATTTGT TGATTAATTA TGTATTAGAA AATGTAAAGC 2580
AATATTCAAT TGTAACATGC ATCATCTAAC ACCAATATCT TGTACTAACC TTTTGTAATT 2640
TTCCTATAAA CATTTTAAAA GGCTAATTTA AATAAAAATT ACAATAAACG TGATAACTCA 2700
CTTTCGTAAC GCATATTTAT TCAAATATAC CAAAATTTAC CATTTTAAGT AAGAGAATCT 2760
TTTTAAAATT AATTTTCAAT TTCATTAATT AAGAAACAAA GAATTTACTG AAACCTATAT 2820
TTTATTAAAT TTTAATAAAA TATATGACTA AAATAACGTC ACGTGAATCT TTCTCAGCCG 2880
TTCGATAATC GAATACTTTA TTGACTAAGT ATTTATTTAG AAAATTTTAA ACAACACTTA 2940
ATTTCTAGAA ACAAAGAGAG CCTCATATGT ATAAAAATCT TCTTCTTATC TTTCTTTCTT 3000
TCTTAATAGT CTTTATTTTT ACTTAATTAC TTTGGTAATT TGTGAAAAAC ACAACCAATG 3060
AGAGAAGAGC AGTTTGACTG GCCACATAGC CAATGAGACA AGCCAATGGG AAAGAGATAT 3120
AGAGACCTCG TAAGAACCGC TCCTTTGCCA TTTGTATCAT CTCTCTATAA AACCACTCAA 3180
CCATCAACCT NTCTTTGCAT GCAACAAATC ACTCAAATAA TTATTTTATA AAGAACAAAA 3240
AAAAAAAGAC GGCAGAGAAA CAATGGAACG TGGAGCTCCC TTCTCTCACT ATCAGCTACC 3300
CAAATCCATC TCTGGTAATC TAAGTGGCTA TTTGTATACA GTATATACTT GCCTCCATGT 3360
ATATTTATAT TCTCGTGAAA AATTGGAGAC ATGCTTTATG AATTTTATGA GACTTTGCAA 3420
CAACGAACGA GATGCTTTCT CTCTAGAAAT TTAAATTTAG ATTTGTGAAG GTTTTGGGAA 3480
TGGCCCGGAG AAGACGATTT TATATATACA TGCATGCAAG AGTTTGATAT GTATATTGTT 3540
TCATCATGGC TGAGTCAAAG TTTTATCCAA ATATTTCCAT GGTGTGGTAT TAGTTAAACA 3600
AATCTCTCGT ATGTGTCATT GAATATACCC GTGCATGTAC CAGGAATGTT TTTGATTCTA 3660
AAAACGTTTT TTTCTTTGTT GTAACGGTTG AGTTTTTTTC TTCGTTTCAA AACGAGATTC 3720
TCGTTTGTCT CTTCCCTTGT CTAAAAACAT CTACGGTTCA TGTGATTCAA AAACACTAAA 3780
AAAATATAAA CTCATTTTTT TTTAATACTT AACATTTAAA CTATATATAT ATATATATAT 3840
ATATATATCT TATACTAGTC CCAAGTTTTA GTGTGAGGTT TTTTTATTCA AAATCTATCA 3900
GTACATTTTT TGGAAAAGAA CTAAGTGAAA TTTTCTCCAA ATTTTCCTTT TACTATTGAT 3960
TTTTTAATTA CTGGATGTCA TTAACTTTAA TCTTTTGATT CTTTCAACGT TTACCATTGG 4020
GAACCTTCAC ATGAAATAAA TGTCTACTTT ATTGAGTCAT ACCTTCGTCA ACATAAATTA 4080
ATTGATGTTC TTCTCCAAAT TTTGAGTTTT TGGTTTTTCT AATAATCTTA ACGAAAGCTT 4140
TTTGGTATAC ATGTAAAACG TAACGGCAAG AATCTGAACA GTCTACTCAA CGGGGTCCAT 4200
AAGTCTAGAA TGTAGACCCC ACAAACTTAC TCTTATCTTA TTGGTCCGTA ACTAAGAACG 4260
TGTCCCTCTG ATTCTCTTGT TTTCTTCTAA TTAATTCGTA TCCTACAAAT TTAATTATCA 4320
TTTCTACTTC AACTAATCTT TTTTTATTTC CTAAAGATTT CAATTTCTCT CTGTATTTTC 4380
TATGAACAGA ATTGAACTTG GACCAGCACA GCAACAACCC AACCCCAATG ACCAGCTCAG 4440
TCATAGTAGC CGGCGCCGGT GACAAGAACA ATGGTATCGT GGTCCAGCAG CAACCACCAT 4500
GTGTGGCTCG TGAGCAAGAC CAATACATGC CAATCGCAAA CGTCATAAGA ATCATGCGTA 4560
AAACCTTACC GTCTCACGCC AAAATCTCTG ACGACGCCAA AGAAACGATT CAAGAATGTG 4620
TCTCCGAGTA CATCAGCTTC GTGACCGGTG AAGCCAACGA GCGTTGCCAA CGTGAGCAAC 4680
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GTAAGACCAT AACTGCTGAA GATATCCTTT GGGCTATGAG CAAGCTTGGG TTCGATAACT 4 74 0 

ACGTGGACCC CCTCACCGTG TTCATTAACC GGTACCGTGA GATAGAGACC GATCGTGGTT 4 8 00 

CTGCACTTAG AGGTGAGCCA CCGTCGTTGA GACAAACCTA TGGAGGAAAT GGTATTGGGT 4 86 0 

TTCACGGCCC ATCTCATGGC CTACCTCCTC CGGGTCCTTA TGGTTATGGT ATGTTGGACC 4 92 0 

AATCCATGGT TATGGGAGGT GGTCGGTACT ACCAAAACGG GTCGTCGGGT CAAGATGAAT 4 9 80 

CCAGTGTTGG TGGTGGCTCT TCGTCTTCCA TTAACGGAAT GCCGGCTTTT GACCATTATG 504 0 

GTCAGTATAA GTGAAGAAGG AGTTATTCTT CATTTTTATA TCTATTCAAA ACATGTGTTT 5100 

15 CGATAGATAT TTTATTTTTA TGTCTTATCA ATAACATTTC TATATAATGT TGCTTCTTTA 516 0 

AGGAAAAGTG TTGTATGTCA ATACTTTATG AGAAACTGAT TTATATATGC AAATGATTGA 522 0 

ATCCAAACTG TTTTGTGGAT TAAACTCTAT GCAACATTAT ATATTTACAT GATCTAAAGG 52 8 0 

20 

TTTTGTAATT CAAAAGCTGT CATAGTTAGA AGATAACTAA ACATTGTAGT AACCAAGTTT 534 0 

AATTTACTTT TTTGAGTTTA CATAACTAAC CAAGCCAAAA GGTTATAAAA TCTAAATTCG 54 0 0 

25 TTGAGTTGTC AAACTTCTGA AGATTGCTAT CCTCTTTGAG TTGCTTTCTT TTGGGTGCTT 54 6 0 

GAGTTTCATT AGGCTGAGCT GACTCGTTGC TCTCTAGTCT TTCATCTCTG TCTTTTCCAA 552 0 

GGATTCAT7U\ CGTTGGTCGC TCTCTGTTTC TGCCTACACT TCTTCAAGGG ATCATTACTG 558 0 

30 

AGGCTAAGAG TTAAAGACCT GAACCATGGT TTTCTGTAAC TGGTTCAAGT TCATTCTCCG 564 0 

GTTATTGTGT GGTTATCTTT CGGTTAGATT GAAAGCCATA TGTTTGCTCT GTTTCTTCTA 5700 

35 GTTCCAAGTT TAATTTCCGG TTATTGTTTG GCTTTTTAAA AGTTTTTAAG GTCTATTCTA 57 6 0 

TGTAAAGACT ATTCTACGTA CGTACATTTA TCGCAAAATT GAAAGATTAT AAAAAAAATT 582 0 

GAAAGATCCA AAGGATU^CCA ATAGATTAAA CTAAAATGTA GTATCCTTTT TATCATTTTA 5 880 

40 

GGCTATGTTT TCTTTTAAGA AAGCTTTGGT AGTTAACTCT GTTTAAAAGA AAAAAAAGAG 5 94 0 

ATGCATAAAT TAAATTTAAG TTTCTAGAAC TTTTGGATAA ACATATTAAG CTAAAGAAAT 6 0 00 

45 TAAACTAAAG GGCGT7VAATG CAAGCTTGTT ATGCGTTATT GAAAACATTA CCTCTAAATT 6 06 0 

AAATAGCCCA ATATTGAAAA CCTTAAGCTT CTTTGATCCC CTTAACTTGT TTGTCCACCA 612 0 

AGTATTAGTT CATCTCTTAA CACGGCAACT CGAAACGGCA CAATGGACAA ACATGGTCTT 6180 

50 

TCAAAAACCA CTTCCCAATA CATCCATCGT CAAACTCGTG GCCACATGGT AAGGTCACCA 6 24 0 

CTATTTCTCC CTTTTCAAAC TCCTCCAAAC T^^TTGTGCA CACACTGGCG TCAGAGTTGG 63 00 

55 ATTTCTTCTT ATTATTATAT ACTTTCCTTG CCT^AACGGTC AACCACAAAC TTATTTGCCG 63 60 

GTCTAATTAA CTCGATATTA TTGGTGGTCT CATCAAACGA GTCAATCCGA GGAGGAGGTG 642 0 
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GAACAATGAC TTTACAGTAC ATGTAAACTA ACGTAGCACA AACTGAAGAG TCTACCATAG 
AAATCGACTT ACAGATTCGT TCAGTGAGTT GAGAGTTAGC AATGTCAACA TATTGTTCGG 
AGAGCCCTGC TGAGTACAAC CATTCATTCA GTTTTTTCGA GTCATTAGGG TAGGAGGATA 
TGACACCTTC GTAGTCATTG TACGAGAGAA CGAAATTTGG TGGAAGACTA ATTGATGTGT 
CCGATCTTCG GGCACTTACG CAGATTTTGA ATGATCCAGC ATCTTGTGAT TTCGGTTTGA 
GGTCTATTTC GCCGCCAAAG GATATTTCCG CTTCCATAGC TATCAAAGAG AAAGAAAAAT 
AGTGAATCCA AGGTTTAGGG TTTCTTTTCT TTGTCTTNCT TATATATAGA GGCGCTAGAT 
TGTATTAAGG ATTATACATA TATATAAGTA ATTGCAATTT GTGAGTTTAT CCTTATTCAT 
TTTTAATTTT ATTTACCTTT ATTTAGTTGA TATTGTGTCC TTTTCCTAGG TAGCATTTCC 
TTCCATCTGT GTTAATTATT AGCATTTCCT TTCCTTTGTC TTATTTGCCT TTATTTCGTA 
GGAAGAAATC CTTTATGNAC CCCATCTTGG CTGAGAACTT GAGATGATTT TAAATCCTCA 
AAAATTATTC AATTTATGAT TTCGAAATTG ATATACACTT TATATTTTCT CCTAAAAAAC 
CATATTGTAC TAAGAAAAGT AGAAAACCAG ACTTTTTAAT ATGTTAGATT TTAATTGGGT 
TCTTAAAGTG TTTTAGCGTT TNACACCGGT TATTCTCCAA AATCCAAACT CTATAATTAT 
AGTTTTTAAG TATAAATTAA TCCGGTTGGC CCAATTAGTG GACCGTTTAA AGAGTAGACA 
CTTTTTTTTT TATATATCGA CTACCATAAA ACTTTAACGA TTAATATTTT TGGATAATAA 
GCGATCGTTT TGAGGCGTCC CAATTTTTTT TGTTTCTTTT TATATGAGAA ATGGGTTTAA 
GAAAAACTGC AATTTTGTCC ATAAAGCTAG TCAGAATTCC TGCAGCCCGG GGGATCCACT 
AGTTCTAGAG CGGCCGCCAC CGCGGTGGAG CTCCAATTCG CCCTATAGTG AGTCGTATTA 

(2) INFORMATION FOR SEQ ID NO : 5 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

( C ) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

Met Pro Ile Ala Asn Val Ile
1 5

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 12 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

Ile Gln Glu Cys Val Ser Glu Tyr Ile Ser Phe Val
1 5 10

(2) INFORMATION FOR SEQ ID NO : 7 : 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 25 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:
GGAATTCAGC AACAACCCAA CCCCA 

(2) INFORMATION FOR SEQ ID NO : 8 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 8 : 
GCTCTAGACA TACAACACTT TTCCTTA 

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:
ATGACCAGCT CAGTCATAGT AGC



(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY : linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10 
GCCACACATG GTGGTTGCTG CTG 

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 
GAGATAGAGA CCGATCGTGG TTC 
(2) INFORMATION FOR SEQ ID NO : 12 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 
TCACTTATAC TGACCATAAT GGTC 



(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13: 
GCATAGATGC ACTCGAAATC AGCC 



(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14: 
GCTTGGTAAT AATTGTCATT AG

(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY : linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 
CTAAAAACAT CTACGGTTCA

(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16 
TTTGTGGTTG ACCGTTTGGC 

(2) INFORMATION FOR SEQ ID NO: 17: 



(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 7 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

Leu Pro Ile Ala Asn Val Ala
1 5



(2) INFORMATION FOR SEQ ID NO: 18: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

Met Gln Glu Cys Val Ser Glu Phe Ile Ser Phe Val
1 5 10

WHAT IS CLAIMED IS:

1. An isolated nucleic acid molecule comprising a LEC1 polynucleotide
sequence, which polynucleotide sequence specifically hybridizes to SEQ. ID. No. 1 under
stringent conditions.

2. The isolated nucleic acid molecule of claim 1, wherein the LEC1
polynucleotide is between about 100 nucleotides and about 630 nucleotides in length.

3. The isolated nucleic acid molecule of claim 1, wherein the LEC1
polynucleotide is SEQ. ID. No. 1.

4. The isolated nucleic acid molecule of claim 1, wherein the LEC1
polynucleotide encodes a LEC1 polypeptide of between about 50 and about 210 amino acids.

5. The isolated nucleic acid molecule of claim 4, wherein the LEC1
polypeptide has an amino acid sequence as shown in SEQ. ID. No. 2.

6. The isolated nucleic acid molecule of claim 1, further comprising a
plant promoter operably linked to the LEC1 polynucleotide.

7. The isolated nucleic acid molecule of claim 6, wherein the plant
promoter is from a LEC1 gene.

8. The isolated nucleic acid of claim 7, wherein the LEC1 gene is as
shown in SEQ. ID. No. 3.

9. The isolated nucleic acid of claim 7, wherein the LEC1 gene is as
shown in SEQ. ID. No. 4.

10. The isolated nucleic acid of claim 7, wherein the LEC1 polynucleotide
is linked to the promoter in an antisense orientation.

11. An isolated nucleic acid molecule comprising a LEC1 polynucleotide
sequence, which polynucleotide sequence encodes a LEC1 polypeptide of between about 50
and about 210 amino acids.

12. The isolated nucleic acid of claim 10, wherein the LEC1 polypeptide
has an amino acid sequence as shown in SEQ. ID. No. 2.

13. A transgenic plant comprising an expression cassette containing a plant
promoter operably linked to a heterologous LEC1 polynucleotide that specifically hybridizes
to SEQ. ID. No. 1 under stringent conditions.

14. The transgenic plant of claim 12, wherein the heterologous LEC1
polynucleotide encodes a LEC1 polypeptide.

15. The transgenic plant of claim 13, wherein the LEC1 polypeptide is
SEQ. ID. No. 2.

16. The transgenic plant of claim 12, wherein the heterologous LEC1
polynucleotide is linked to the promoter in an antisense orientation.

17. The transgenic plant of claim 12, wherein the plant promoter is from a
LEC1 gene.

18. The transgenic plant of claim 16, wherein the LEC1 gene is as shown
in SEQ. ID. No. 3.

19. The transgenic plant of claim 12, which is a member of the genus
Brassica.

20. A method of modulating seed development in a plant, the method 
comprising introducing into the plant an expression cassette containing a plant promoter 
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operably linked to a heterologous LECl polynucleotide that specifically hybridizes to SEQ. 
ID. No. 1 under stringent conditions. 

21. The method of claim 19, wherein the heterologous LEC1 
polynucleotide encodes a LEC1 polypeptide. 

22. The method of claim 20, wherein the LEC1 polypeptide has an amino 
acid sequence as shown in SEQ. ID. No. 2. 

23. The method of claim 19, wherein the heterologous LEC1 
polynucleotide is linked to the promoter in an antisense orientation. 

24. The method of claim 19, wherein the heterologous LEC1 
polynucleotide is SEQ. ID. No. 1. 

25. The method of claim 19, wherein the plant promoter is from a LEC1 
gene. 

26. The method of claim 19, wherein the LEC1 gene is as shown in SEQ. 
ID. No. 3. 

27. The method of claim 19, wherein the plant is a member of the genus 
Brassica. 

28. The method of claim 19, wherein the expression cassette is introduced 
into the plant through a sexual cross. 

29. An isolated nucleic acid molecule comprising a plant promoter that 
specifically hybridizes to a polynucleotide sequence consisting of nucleotides 1 to 1998 of 
SEQ. ID. No. 3. 

30. The isolated nucleic acid molecule of claim 28, wherein the plant 
promoter sequence consists essentially of nucleotides 1 to 1998 of SEQ. ID. No. 3. 

31. The isolated nucleic acid molecule of claim 28, wherein the plant 
promoter sequence is a subsequence of SEQ. ID. No. 4. 

32. The isolated nucleic acid molecule of claim 28, further comprising a 
polynucleotide sequence operably linked to the plant promoter sequence. 

33. The isolated nucleic acid of claim 30, wherein the polynucleotide 
sequence operably linked to the plant promoter sequence encodes a desired polypeptide. 

34. The isolated nucleic acid molecule of claim 28, wherein the 
polynucleotide sequence is linked to the promoter in an antisense orientation. 

35. A transgenic plant comprising an expression cassette containing a 
LEC1 promoter operably linked to a heterologous polynucleotide sequence, wherein the 
LEC1 promoter specifically hybridizes to SEQ. ID. No. 3 under stringent conditions. 

36. The transgenic plant of claim 33, wherein the polynucleotide sequence 
encodes a desired polypeptide. 

37. The transgenic plant of claim 33, wherein the heterologous 
polynucleotide sequence is linked to the LEC1 promoter in an antisense orientation. 

38. The transgenic plant of claim 33, wherein the LEC1 promoter is as 
shown in SEQ. ID. No. 3. 

39. The transgenic plant of claim 33, which is a member of the genus 
Brassica. 

40. A method of targeting expression of a polynucleotide to a seed, the 
method comprising introducing into a plant an expression cassette containing a LEC1 
promoter operably linked to a heterologous polynucleotide sequence, wherein the LEC1 
promoter specifically hybridizes to a polynucleotide sequence consisting of nucleotides 1 to 
1998 of SEQ. ID. No. 3. 

41. The method of claim 38, wherein the heterologous polynucleotide 
sequence encodes a desired polypeptide. 

42. The method of claim 38, wherein the heterologous polynucleotide 
sequence is linked to the promoter in an antisense orientation. 

LEAFY COTYLEDON1 GENES AND THEIR USES 

FIELD OF THE INVENTION 
The present invention is directed to plant genetic engineering. In particular, 
it relates to new embryo-specific genes useful in improving agronomically important 
plants. 

BACKGROUND OF THE INVENTION 

Embryogenesis in higher plants is a critical stage of the plant life cycle in which 
the primary organs are established. Embryo development can be separated into two main 
phases: the early phase in which the primary body organization of the embryo is laid down 
and the late phase which involves maturation, desiccation and dormancy. In the early phase, 
the symmetry of the embryo changes from radial to bilateral, giving rise to a hypocotyl with a 
shoot meristem surrounded by the two cotyledonary primordia at the apical pole and a root 
meristem at the basal pole. In the late phase, during maturation the embryo achieves its 
maximum size and the seed accumulates storage proteins and lipids. Maturation is ended by 
the desiccation stage in which the seed water content decreases rapidly and the embryo passes 
into a metabolically quiescent state. Dormancy ends with seed germination, and development 
continues from the shoot and the root meristem regions. 

The precise regulatory mechanisms which control cell and organ differentiation 
during the initial phase of embryogenesis are largely unknown. The plant hormone abscisic 
acid (ABA) is thought to play a role during late embryogenesis, mainly in the maturation 
stage by inhibiting germination during embryogenesis (Black, M. (1991). In Abscisic Acid: 
Physiology and Biochemistry, W. J. Davies and H. G. Jones, eds. (Oxford: Bios Scientific 
Publishers Ltd.), pp. 99-124; Koornneef, M., and Karssen, C. M. (1994). In Arabidopsis, E. 
M. Meyerowitz and C. R. Somerville, eds. (Cold Spring Harbor: Cold Spring Harbor 
Laboratory Press), pp. 313-334). Mutations which affect seed development and are ABA 
insensitive have been identified in Arabidopsis and maize. The ABA insensitive (abi3) mutant 
of Arabidopsis and the viviparous1 (vp1) mutant of maize are detected mainly during late 
embryogenesis (McCarty et al., (1989) Plant Cell 1:523-532 and Parcy et al., (1994) Plant 
Cell 6:1567-1582). Both the VP1 gene and the ABI3 gene have been isolated and were 
found to share conserved regions (Giraudat, J. (1995) Current Opinion in Cell Biology 
7:232-238 and McCarty, D. R. (1995) Annu. Rev. Plant Physiol. Plant Mol. Biol. 
46:71-93). The VP1 gene has been shown to function as a transcription activator 
(McCarty et al., (1991) Cell 66:895-906). It has been suggested that ABI3 has a similar 
function. 

Another class of embryo defective mutants involves three genes: LEAFY 
COTYLEDON1 and 2 (LEC1, LEC2) and FUSCA3 (FUS3). These genes are thought to 
play a central role in late embryogenesis (Baumlein et al., (1994) Plant J. 6:379-387; 
Meinke, D. W. (1992) Science 258:1647-1650; Meinke et al., Plant Cell 6:1049-1064; 
West et al., (1994) Plant Cell 6:1731-1745). Like the abi3 mutant, leafy cotyledon-type 
mutants are defective in late embryogenesis. In these mutants, seed morphology is altered, 
the shoot meristem is activated early, storage proteins are lacking and developing 
cotyledons accumulate anthocyanin. As with abi3 mutants, they are desiccation intolerant 
and therefore die during late embryogenesis. Nevertheless, the immature mutant embryos 
can be rescued to give rise to mature and fertile plants. However, unlike abi3, when the 
immature mutants germinate they exhibit trichomes on the adaxial surface of the 
cotyledon. Trichomes are normally present only on leaves, stems and sepals, not 
cotyledons. Therefore, it is thought that the leafy cotyledon type genes have a role in 
specifying cotyledon identity during embryo development. 

Among the above mutants, the lec1 mutant exhibits the most extreme 
phenotype during embryogenesis. For example, the maturation and postgermination 
programs are active simultaneously in the lec1 mutant (West et al., 1994), suggesting a 
critical role for LEC1 in gene regulation during late embryogenesis. 

In spite of the recent progress in defining the genetic control of embryo 
development, further progress is required in the identification and analysis of genes 
expressed specifically in the embryo and seed. Characterization of such genes would 
allow for the genetic engineering of plants with a variety of desirable traits. For instance, 
modulation of the expression of genes which control embryo development may be used to 
alter traits such as accumulation of storage proteins in leaves and cotyledons. 
Alternatively, promoters from embryo or seed-specific genes can be used to direct 
expression of desirable heterologous genes to the embryo or seed. The present invention 
addresses these and other needs. 

SUMMARY OF THE INVENTION 
The present invention is based, in part, on the isolation and characterization of 
LEC1 genes. The invention provides isolated nucleic acid molecules comprising a LEC1 
polynucleotide sequence, typically about 630 nucleotides in length, which specifically 
hybridizes to SEQ. ID. No. 1 under stringent conditions. The LEC1 polynucleotides of the 
invention can encode a LEC1 polypeptide of about 210 amino acids, typically as shown in 
SEQ. ID. No. 2. 

The nucleic acids of the invention may also comprise expression cassettes 
containing a plant promoter operably linked to the LEC1 polynucleotide. In some 
embodiments, the promoter is from a LEC1 gene, for instance, as shown in SEQ. ID. No. 3. 
The LEC1 polynucleotide may be linked to the promoter in a sense or antisense orientation. 

The invention also provides transgenic plants comprising an expression 
cassette containing a plant promoter operably linked to a heterologous LEC1 
polynucleotide. The LEC1 polynucleotide may encode a LEC1 polypeptide or may be linked to the 
promoter in an antisense orientation. The plant promoter may be from any number of 
sources, including a LEC1 gene, such as that shown in SEQ. ID. No. 3 or SEQ. ID. No. 4. 
The transgenic plant can be any desired plant but is often a member of the genus Brassica. 

Methods of modulating seed development in a plant are also provided. The 
methods comprise introducing into a plant an expression cassette containing a plant promoter 
operably linked to a heterologous LEC1 polynucleotide. The LEC1 polynucleotide may encode a LEC1 
polypeptide or may be linked to the promoter in an antisense orientation. The expression 
cassette can be introduced into the plant by any number of means known in the art, including 
through a sexual cross. 

The invention further provides expression cassettes containing promoter 
sequences from LEC1 genes. The promoters of the invention can be characterized by their 
ability to specifically hybridize to a polynucleotide sequence consisting of nucleotides 1 to 
1998 of SEQ. ID. No. 3. The promoters of the invention can be operably linked to a variety 
of nucleic acids whose expression is to be targeted to embryos or seeds. Transgenic plants 
comprising the expression cassettes are also provided. 

The promoters of the invention can be used in methods of targeting expression 
of a desired polynucleotide to seeds. The methods comprise introducing into a plant an 
expression cassette containing a LEC1 promoter operably linked to a heterologous 
polynucleotide sequence. 

Definitions 

The phrase "nucleic acid" refers to a single or double-stranded polymer of 
deoxyribonucleotide or ribonucleotide bases read from the 5' to the 3' end. Nucleic acids 
may also include modified nucleotides that permit correct read through by a polymerase 
and do not alter expression of a polypeptide encoded by that nucleic acid. 

The phrase "polynucleotide sequence" or "nucleic acid sequence" includes 
both the sense and antisense strands as either individual single strands or in the duplex. It 
includes, but is not limited to, self-replicating plasmids, chromosomal sequences, and 
infectious polymers of DNA or RNA. 

The phrase "nucleic acid sequence encoding" refers to a nucleic acid which 
directs the expression of a specific protein or peptide. The nucleic acid sequences include 
both the DNA strand sequence that is transcribed into RNA and the RNA sequence that is 
translated into protein. The nucleic acid sequences include both the full length nucleic 
acid sequences as well as non-full length sequences derived from the full length sequences. 
It should be further understood that the sequence includes the degenerate codons of the 
native sequence or sequences which may be introduced to provide codon preference in a 
specific host cell. 

The term "promoter" refers to a region or to sequence determinants located 
upstream or downstream from the start of transcription that are involved in 
recognition and binding of RNA polymerase and other proteins to initiate transcription. A 
"plant promoter" is a promoter capable of initiating transcription in plant cells. Such 
promoters need not be of plant origin; for example, promoters derived from plant viruses, 
such as the CaMV 35S promoter, can be used in the present invention. 

The term "plant" includes whole plants, plant organs (e.g., leaves, stems, 
flowers, roots, etc.), seeds and plant cells and progeny of same. The class of plants which 
can be used in the method of the invention is generally as broad as the class of higher 
plants amenable to transformation techniques, including both monocotyledonous and 
dicotyledonous plants, as well as certain lower plants such as algae. It includes plants of a 
variety of ploidy levels, including polyploid, diploid and haploid. 

A polynucleotide sequence is "heterologous to" an organism or a second 
polynucleotide sequence if it originates from a foreign species, or, if from the same 
species, is modified from its original form. For example, a promoter operably linked to a 
heterologous coding sequence refers to a coding sequence from a species different from 
that from which the promoter was derived, or, if from the same species, a coding sequence 
which is different from any naturally occurring allelic variants. As defined here, a 
modified LEC1 coding sequence which is heterologous to an operably linked LEC1 promoter 
does not include the T-DNA insertional mutants as described in West et al., The Plant Cell 
6:1731-1745 (1994). 

A polynucleotide "exogenous to" an individual plant is a polynucleotide which 
is introduced into the plant by any means other than by a sexual cross. Examples of means by 
which this can be accomplished are described below, and include Agrobacterium-mediated 
transformation, biolistic methods, electroporation, in planta techniques, and the like. Such a 
plant containing the exogenous nucleic acid is referred to here as an R1 generation transgenic 
plant. Transgenic plants which arise from sexual cross or by selfing are descendants of 
such a plant. 

As used herein an "embryo-specific gene" or "seed specific gene" is a gene 
that is preferentially expressed during embryo development in a plant. For purposes of 
this disclosure, embryo development begins with the first cell divisions in the zygote and 
continues through the late phase of embryo development (characterized by maturation, 
desiccation, dormancy), and ends with the production of a mature and desiccated seed. 
Embryo-specific genes can be further classified as "early phase-specific" and "late 
phase-specific". Early phase-specific genes are those expressed in embryos up to the end of embryo 
morphogenesis. Late phase-specific genes are those expressed from maturation through to 
production of a mature and desiccated seed. 

A "LEC1 polynucleotide" is a nucleic acid sequence comprising (or consisting 
of) a coding region of about 100 to about 900 nucleotides, sometimes from about 300 to 
about 630 nucleotides, which hybridizes to SEQ. ID. No. 1 under stringent conditions (as 
defined below), or which encodes a LEC1 polypeptide. LEC1 polynucleotides can also be 
identified by their ability to hybridize under low stringency conditions (e.g., Tm -40° C) to 
nucleic acid probes having a sequence from position 1 to 81 in SEQ. ID. No. 1 or from 
position 355 to 627 in SEQ. ID. No. 1. 

A "promoter from a LEC1 gene" or "LEC1 promoter" will typically be about 
500 to about 2000 nucleotides in length, usually from about 750 to 1500. An exemplary 
promoter sequence is shown as nucleotides 1-1998 of SEQ. ID. No. 3. A LEC1 promoter 
can also be identified by its ability to direct expression in all, or essentially all, proglobular 
embryonic cells, as well as cotyledons and axes of a late embryo. 

A "LEC1 polypeptide" is a sequence of about 50 to about 210, sometimes 100 
to 150, amino acid residues encoded by a LEC1 polynucleotide. A full length LEC1 
polypeptide and fragments containing a CCAAT binding factor (CBF) domain can act as a 
subunit of a protein capable of acting as a transcription factor in plant cells. LEC1 
polypeptides are often distinguished by the presence of a sequence which is required for 
binding the nucleotide sequence: CCAAT. In particular, a short region of seven residues 
(MPIANVI) at residues 34-40 of SEQ. ID No. 3 shows a high degree of similarity to a region 
that has been shown to be required for binding the CCAAT box. Similarly, residues 61-72 of 
SEQ. ID No. 3 (IQECVSEYISFV) are nearly identical to a region that contains a subunit 
interaction domain (Xing et al., (1993) EMBO J. 12:4647-4655). 

As used herein, a homolog of a particular embryo-specific gene (e.g., SEQ. 
ID. No. 1) is a second gene in the same plant type or in a different plant type, which has a 
polynucleotide sequence of at least 50 contiguous nucleotides which are substantially identical 
(determined as described below) to a sequence in the first gene. It is believed that, in general, 
homologs share a common evolutionary past. 

A "polynucleotide sequence from" a particular embryo-specific gene is a 
subsequence or full length polynucleotide sequence of an embryo-specific gene which, when 
present in a transgenic plant, has the desired effect, for example, inhibiting expression of the 
endogenous gene or driving expression of a heterologous polynucleotide. A full length 
sequence of a particular gene disclosed here may contain about 95%, usually at least about 
98% of an entire sequence shown in the Sequence Listing, below. 

In the case of both expression of transgenes and inhibition of endogenous 
genes (e.g., by antisense, or sense suppression) one of skill will recognize that the inserted 
polynucleotide sequence need not be identical and may be "substantially identical" to a 
sequence of the gene from which it was derived. As explained below, these variants are 
specifically covered by this term. 

In the case where the inserted polynucleotide sequence is transcribed and 
translated to produce a functional polypeptide, one of skill will recognize that because of 
codon degeneracy a number of polynucleotide sequences will encode the same polypeptide. 
These variants are specifically covered by the term "polynucleotide sequence from" a 
particular embryo-specific gene, such as LEC1. In addition, the term specifically includes 
sequences (e.g., full length sequences) substantially identical (determined as described below) 
with a LEC1 gene sequence and that encode proteins that retain the function of a LEC1 
polypeptide. 

In the case of polynucleotides used to inhibit expression of an endogenous 
gene, the introduced sequence need not be perfectly identical to a sequence of the target 
endogenous gene. The introduced polynucleotide sequence will typically be at least 
substantially identical (as determined below) to the target endogenous sequence. 

Two nucleic acid sequences or polypeptides are said to be "identical" if the 
sequence of nucleotides or amino acid residues, respectively, in the two sequences is the same 
when aligned for maximum correspondence as described below. The term "complementary 
to" is used herein to mean that the sequence is complementary to all or a portion of a 
reference polynucleotide sequence. 

Optimal alignment of sequences for comparison may be conducted by the local 
homology algorithm of Smith and Waterman, Adv. Appl. Math. 2:482 (1981), by the 
homology alignment algorithm of Needleman and Wunsch, J. Mol. Biol. 48:443 (1970), by 
the search for similarity method of Pearson and Lipman, Proc. Natl. Acad. Sci. (U.S.A.) 
85:2444 (1988), by computerized implementations of these algorithms (GAP, BESTFIT, 
BLAST, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics 
Computer Group (GCG), 575 Science Dr., Madison, WI), or by inspection. 

"Percentage of sequence identity" is determined by comparing two optimally 
aligned sequences over a comparison window, wherein the portion of the polynucleotide 
sequence in the comparison window may comprise additions or deletions (i.e., gaps) as 
compared to the reference sequence (which does not comprise additions or deletions) for 
optimal alignment of the two sequences. The percentage is calculated by determining the 
number of positions at which the identical nucleic acid base or amino acid residue occurs in 
both sequences to yield the number of matched positions, dividing the number of matched 
positions by the total number of positions in the window of comparison and multiplying the 
result by 100 to yield the percentage of sequence identity. 
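
The calculation just described can be sketched in a few lines of Python. This is only an illustration, assuming the two sequences have already been optimally aligned (for instance by one of the programs named above) and that gaps are written as "-"; the function name percent_identity and the gap symbol are assumptions made for this sketch and do not come from any particular software package.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of sequence identity over a comparison window.

    Both inputs are assumed to be pre-aligned and therefore of equal length;
    gap columns count toward the window size but never count as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("comparison window is empty")
    # Count positions at which the identical base or residue occurs in both.
    matched = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != gap
    )
    # Divide by the total number of positions in the window, multiply by 100.
    return 100.0 * matched / len(aligned_a)
```

For example, percent_identity("ATGC-A", "ATGGTA") counts four matched positions in a window of six. Producing the optimal alignment itself is outside the scope of this sketch.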

The term "substantial identity" of polynucleotide sequences means that a 
polynucleotide comprises a sequence that has at least 80% sequence identity, preferably at 
least 85%, more preferably at least 90% and most preferably at least 95%, compared to a 
reference sequence using the programs described above (preferably BLAST) using standard 
parameters. One of skill will recognize that these values can be appropriately adjusted to 
determine corresponding identity of proteins encoded by two nucleotide sequences by taking 
into account codon degeneracy, amino acid similarity, reading frame positioning and the like. 
Substantial identity of amino acid sequences for these purposes normally means sequence 
identity of at least 40%, preferably at least 60%, more preferably at least 90%, and most 
preferably at least 95%. Polypeptides which are "substantially similar" share sequences as 
noted above except that residue positions which are not identical may differ by conservative 
amino acid changes. Conservative amino acid substitutions refer to the interchangeability of 
residues having similar side chains. For example, a group of amino acids having aliphatic 
side chains is glycine, alanine, valine, leucine, and isoleucine; a group of amino acids having 
aliphatic-hydroxyl side chains is serine and threonine; a group of amino acids having amide- 
containing side chains is asparagine and glutamine; a group of amino acids having aromatic 
side chains is phenylalanine, tyrosine, and tryptophan; a group of amino acids having basic 
side chains is lysine, arginine, and histidine; and a group of amino acids having sulfur- 
containing side chains is cysteine and methionine. Preferred conservative amino acid 
substitution groups are: valine-leucine-isoleucine, phenylalanine-tyrosine, lysine-arginine, 
alanine-valine, aspartic acid-glutamic acid, and asparagine-glutamine. 
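
As an illustration only, the groupings listed above can be written as lookup tables using the standard one-letter amino acid codes. The names SIDE_CHAIN_GROUPS, PREFERRED_GROUPS, shares_side_chain_group and is_preferred_substitution are hypothetical and chosen for this sketch; the group memberships are taken directly from the preceding paragraph.

```python
# Side-chain groups as listed in the text (one-letter residue codes).
SIDE_CHAIN_GROUPS = {
    "aliphatic": frozenset("GAVLI"),
    "aliphatic-hydroxyl": frozenset("ST"),
    "amide-containing": frozenset("NQ"),
    "aromatic": frozenset("FYW"),
    "basic": frozenset("KRH"),
    "sulfur-containing": frozenset("CM"),
}

# Preferred conservative substitution groups as listed in the text.
PREFERRED_GROUPS = (
    frozenset("VLI"),  # valine-leucine-isoleucine
    frozenset("FY"),   # phenylalanine-tyrosine
    frozenset("KR"),   # lysine-arginine
    frozenset("AV"),   # alanine-valine
    frozenset("DE"),   # aspartic acid-glutamic acid
    frozenset("NQ"),   # asparagine-glutamine
)


def shares_side_chain_group(a: str, b: str) -> bool:
    """True if two different residues fall in the same side-chain group."""
    a, b = a.upper(), b.upper()
    return a != b and any(a in g and b in g for g in SIDE_CHAIN_GROUPS.values())


def is_preferred_substitution(a: str, b: str) -> bool:
    """True if two different residues fall in the same preferred group."""
    a, b = a.upper(), b.upper()
    return a != b and any(a in g and b in g for g in PREFERRED_GROUPS)
```

Keeping the two tables separate reflects the text: aspartic acid and glutamic acid appear only among the preferred substitution groups, while phenylalanine and tryptophan share the aromatic group without forming a preferred pair.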

Another indication that nucleotide sequences are substantially identical is if 
two molecules hybridize to each other, or a third nucleic acid, under stringent conditions. 
Stringent conditions are sequence dependent and will be different in different circumstances. 
Generally, stringent conditions are selected to be about 5° C lower than the thermal melting 
point (Tm) for the specific sequence at a defined ionic strength and pH. The Tm is the 
temperature (under defined ionic strength and pH) at which 50% of the target sequence 
hybridizes to a perfectly matched probe. Typically, stringent conditions will be those in which 
the salt concentration is about 0.02 molar at pH 7 and the temperature is at least about 60° C. 

In the present invention, mRNA encoded by embryo-specific genes of the 
invention can be identified in Northern blots under stringent conditions using cDNAs of the 
invention or fragments of at least about 100 nucleotides. For the purposes of this disclosure, 
stringent conditions for such RNA-DNA hybridizations are those which include at least one 
wash in 0.2X SSC at 63° C for 20 minutes, or equivalent conditions. Genomic DNA or 
cDNA comprising genes of the invention can be identified using the same cDNAs (or 
fragments of at least about 100 nucleotides) under stringent conditions, which for purposes of 
this disclosure, include at least one wash (usually 2) in 0.2X SSC at a temperature of at least 
about 50° C, usually about 55° C, for 20 minutes, or equivalent conditions. 

BRIEF DESCRIPTION OF THE DRAWINGS 
Figure 1 is a restriction map of the 7.4 kb genomic wild-type fragment shown 
in SEQ. ID. No. 4. 

DESCRIPTION OF THE PREFERRED EMBODIMENT 
The present invention provides new embryo-specific genes useful in genetically 
engineering plants. Polynucleotide sequences from the genes of the invention can be used, for 
instance, to direct expression of desired heterologous genes in embryos (in the case of 
promoter sequences) or to modulate development of embryos or other organs (e.g., by 
enhancing expression of the gene in a transgenic plant). In particular, the invention provides a 
new gene from Arabidopsis referred to here as LEC1. LEC1 encodes polypeptides which are 
subunits of a protein which acts as a transcription factor. Thus, modulation of the expression 
of this gene can be used to manipulate a number of useful traits, such as increasing or 
decreasing storage protein content in cotyledons or leaves. 

Generally, the nomenclature and the laboratory procedures in recombinant 
DNA technology described below are those well known and commonly employed in the art. 
Standard techniques are used for cloning, DNA and RNA isolation, amplification and 
purification. Generally enzymatic reactions involving DNA ligase, DNA polymerase, 
restriction endonucleases and the like are performed according to the manufacturer's 
specifications. These techniques and various other techniques are generally performed 
according to Sambrook et al., Molecular Cloning - A Laboratory Manual, 2nd. ed., Cold 
Spring Harbor Laboratory, Cold Spring Harbor, New York, (1989). 

Isolation of nucleic acids of the invention 

The isolation of sequences from the genes of the invention may be 
accomplished by a number of techniques. For instance, oligonucleotide probes based on the 
sequences disclosed here can be used to identify the desired gene in a cDNA or genomic 
DNA library from a desired plant species. To construct genomic libraries, large segments of 
genomic DNA are generated by random fragmentation, e.g. using restriction endonucleases, 
and are ligated with vector DNA to form concatemers that can be packaged into the 
appropriate vector. To prepare a library of embryo-specific cDNAs, mRNA is isolated from 
embryos and a cDNA library which contains the gene transcripts is prepared from the mRNA. 

The cDNA or genomic library can then be screened using a probe based upon 
the sequence of a cloned embryo-specific gene such as the polynucleotides disclosed here. 
Probes may be used to hybridize with genomic DNA or cDNA sequences to isolate 
homologous genes in the same or different plant species. 

Alternatively, the nucleic acids of interest can be amplified from nucleic acid 
samples using amplification techniques. For instance, polymerase chain reaction (PCR) 
technology can be used to amplify the sequences of the genes directly from mRNA, from cDNA, from 
genomic libraries or cDNA libraries. PCR and other in vitro amplification methods may also 
be useful, for example, to clone nucleic acid sequences that code for proteins to be expressed, 
to make nucleic acids to use as probes for detecting the presence of the desired mRNA in 
samples, for nucleic acid sequencing, or for other purposes. 

Appropriate primers and probes for identifying embryo-specific genes from 
plant tissues are generated from comparisons of the sequences provided herein. For a general 
overview of PCR see PCR Protocols: A Guide to Methods and Applications (Innis, M., 
Gelfand, D., Sninsky, J. and White, T., eds.), Academic Press, San Diego (1990). 
Appropriate primers for this purpose include, for instance: UP primer - 5' GGA ATT GAG 
CAA CAA GGG AAG GGC A 3' and LP primer - 5' GCT CTA GAG ATA 
CAA GAG TTT TGG TTA 3'. Alternatively, the following primer pairs can be used: 5' 
ATG AGG AGG TGA GTG ATA GTA GG 3' and 5' GGG AGA GAT GGT GGT TGG TGC 
TG 3' or 5' GAG ATA GAG AGG GAT GGT GGT TG 3' and 5' TGA GTT ATA GTG ACC 
ATA ATG GTG 3'. The amplification conditions are typically as follows. Reaction 
components: 10 mM Tris-HCl, pH 8.3, 50 mM potassium chloride, 1.5 mM magnesium 
chloride, 0.001% gelatin, 200 microM dATP, 200 microM dCTP, 200 microM dGTP, 200 
microM dTTP, 0.4 microM primers, and 100 units per ml Taq polymerase. Program: 96 C 
for 3 min., 30 cycles of 96 C for 45 sec, 50 C for 60 sec, 72 C for 60 sec, followed by 72 C 
for 5 min. 
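
By way of illustration, the thermal cycling program above can be written down as data and its total programmed hold time computed. This is a sketch only: the names Step and total_hold_seconds are hypothetical, the temperatures and durations are those stated above, and the result ignores the time an instrument spends ramping between temperatures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One hold in a thermal cycling program."""
    temp_c: int   # block temperature in degrees C
    seconds: int  # hold time in seconds


# Values taken from the program stated in the text.
INITIAL_DENATURATION = Step(temp_c=96, seconds=3 * 60)
CYCLE = (
    Step(temp_c=96, seconds=45),  # denaturation
    Step(temp_c=50, seconds=60),  # primer annealing
    Step(temp_c=72, seconds=60),  # extension
)
N_CYCLES = 30
FINAL_EXTENSION = Step(temp_c=72, seconds=5 * 60)


def total_hold_seconds(initial, cycle, n_cycles, final):
    """Sum of programmed hold times; ramp times are not included."""
    per_cycle = sum(step.seconds for step in cycle)
    return initial.seconds + n_cycles * per_cycle + final.seconds


if __name__ == "__main__":
    total = total_hold_seconds(
        INITIAL_DENATURATION, CYCLE, N_CYCLES, FINAL_EXTENSION
    )
    print(f"{total} s ({total / 60:.1f} min) of programmed hold time")
```

With the stated values each cycle holds for 165 seconds, so the program amounts to 5430 seconds, or about 90.5 minutes, before ramp times are added.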

Polynucleotides may also be synthesized by well-known techniques as 
described in the technical literature. See, e.g., Carruthers et al., Cold Spring Harbor Symp. 
Quant. Biol. 47:411-418 (1982), and Adams et al., J. Am. Chem. Soc. 105:661 (1983). 
Double stranded DNA fragments may then be obtained either by synthesizing the 
complementary strand and annealing the strands together under appropriate conditions, or by 
adding the complementary strand using DNA polymerase with an appropriate primer 
sequence. 

Use of nucleic acids of the invention to inhibit gene expression 

The isolated sequences prepared as described herein can be used to prepare 
expression cassettes useful in a number of techniques. For example, expression cassettes of 
the invention can be used to suppress endogenous LEC1 gene expression. Inhibiting 
expression can be useful, for instance, in weed control (by transferring an inhibitory sequence 
to a weedy species and allowing it to be transmitted through sexual crosses) or to produce 
fruit with small and non-viable seed. 

A number of methods can be used to inhibit gene expression in plants. For 
instance, antisense technology can be conveniently used. To accomplish this, a nucleic acid 
segment from the desired gene is cloned and operably linked to a promoter such that the 
antisense strand of RNA will be transcribed. The expression cassette is then transformed into 
plants and the antisense strand of RNA is produced. In plant cells, it has been suggested that 
antisense RNA inhibits gene expression by preventing the accumulation of mRNA which 
encodes the enzyme of interest; see, e.g., Sheehy et al., Proc. Natl. Acad. Sci. USA, 
85:8805-8809 (1988), and Hiatt et al., U.S. Patent No. 4,801,340. 
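
The relationship between a sense-strand segment and the antisense RNA transcribed from it can be illustrated with a short sketch. It assumes a plain DNA string over A, C, G and T; the function names reverse_complement and antisense_rna are hypothetical, and the sketch does not model the promoter, the cloning step or any regulatory context.

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence, read 5' to 3'."""
    return dna.translate(_COMPLEMENT)[::-1]


def antisense_rna(sense_dna: str) -> str:
    """RNA complementary to the mRNA that the sense strand would give.

    Cloning a segment in the antisense orientation means the strand that
    is transcribed yields the reverse complement of the sense sequence,
    written here with U in place of T.
    """
    return reverse_complement(sense_dna).upper().replace("T", "U")
```

For instance, a sense segment ATGGCC corresponds to the mRNA AUGGCC, and antisense_rna("ATGGCC") gives GGCCAU, which can base-pair with that mRNA in antiparallel orientation.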

The nucleic acid segment to be introduced generally will be substantially 
identical to at least a portion of the endogenous embryo-specific gene or genes to be 
repressed. The sequence, however, need not be perfectly identical to inhibit expression. The 
vectors of the present invention can be designed such that the inhibitory effect applies to other 
proteins within a family of genes exhibiting homology or substantial homology to the target 
gene.



For antisense suppression, the introduced sequence also need not be full length 
relative to either the primary transcription product or fully processed mRNA. Generally, 
higher homology can be used to compensate for the use of a shorter sequence. Furthermore, 
the introduced sequence need not have the same intron or exon pattern, and homology of 
non-coding segments may be equally effective. Normally, a sequence of between about 30 or
40 nucleotides and about full length nucleotides should be used, though a sequence of at least 
about 100 nucleotides is preferred, a sequence of at least about 200 nucleotides is more 
preferred, and a sequence of at least about 500 nucleotides is especially preferred. 

Catalytic RNA molecules or ribozymes can also be used to inhibit expression
of embryo-specific genes. It is possible to design ribozymes that specifically pair with
virtually any target RNA and cleave the phosphodiester backbone at a specific location,
thereby functionally inactivating the target RNA. In carrying out this cleavage, the ribozyme
is not itself altered, and is thus capable of recycling and cleaving other molecules, making it a
true enzyme. The inclusion of ribozyme sequences within antisense RNAs confers
RNA-cleaving activity upon them, thereby increasing the activity of the constructs.

A number of classes of ribozymes have been identified. One class of 
ribozymes is derived from a number of small circular RNAs which are capable of self- 
cleavage and replication in plants. The RNAs replicate either alone (viroid RNAs) or with a 
helper virus (satellite RNAs). Examples include RNAs from avocado sunblotch viroid and the 
satellite RNAs from tobacco ringspot virus, lucerne transient streak virus, velvet tobacco
mottle virus, solanum nodiflorum mottle virus and subterranean clover mottle virus. The 
design and use of target RNA-specific ribozymes is described in Haseloff et al. Nature, 
334:585-591 (1988). 

Another method of suppression is sense suppression. Introduction of 
expression cassettes in which a nucleic acid is configured in the sense orientation with respect
to the promoter has been shown to be an effective means by which to block the transcription 
of target genes. For an example of the use of this method to modulate expression of 
endogenous genes see, Napoli et al., The Plant Cell 2:279-289 (1990), and U.S. Patents Nos. 
5,034,323, 5,231,020, and 5,283,184. 

Generally, where inhibition of expression is desired, some transcription of the
introduced sequence occurs. The effect may occur where the introduced sequence contains
no coding sequence per se, but only intron or untranslated sequences homologous to
sequences present in the primary transcript of the endogenous sequence. The introduced
sequence generally will be substantially identical to the endogenous sequence intended to be
repressed. This minimal identity will typically be greater than about 65%, but a higher
identity might exert a more effective repression of expression of the endogenous sequences.
Substantially greater identity of more than about 80% is preferred, though about 95% to
absolute identity would be most preferred. As with antisense regulation, the effect should
apply to any other proteins within a similar family of genes exhibiting homology or substantial
homology.
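
As a rough illustration of the identity ranges stated above, the following sketch computes percent identity over two sequences that are assumed to be already aligned (equal length, with "-" for gaps) and maps the result onto the stated ranges. The function names and tier labels are hypothetical, the "about" in each threshold is treated as a hard cutoff purely for simplicity, and a real comparison would use an alignment program first.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned, non-gap columns.

    Both inputs are assumed to be pre-aligned and of equal length;
    columns where either sequence has a gap ('-') are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x != "-" and y != "-"
    ]
    if not columns:
        return 0.0
    matches = sum(x == y for x, y in columns)
    return 100.0 * matches / len(columns)


def identity_tier(pct: float) -> str:
    """Map a percent identity onto the ranges given in the text."""
    if pct >= 95.0:
        return "most preferred"
    if pct > 80.0:
        return "preferred"
    if pct > 65.0:
        return "minimal"
    return "below the stated minimum"


pct = percent_identity("ACGTACGTAC", "ACGTACGTAA")  # made-up sequences
print(pct, identity_tier(pct))
```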

For sense suppression, the introduced sequence in the expression cassette,
needing less than absolute identity, also need not be full length, relative to either the primary
transcription product or fully processed mRNA. This may be preferred to avoid concurrent
production of some plants which are overexpressers. A higher identity in a shorter than full
length sequence compensates for a longer, less identical sequence. Furthermore, the
introduced sequence need not have the same intron or exon pattern, and identity of
non-coding segments will be equally effective. Normally, a sequence of the size ranges noted
above for antisense regulation is used.

Another means of inhibiting LEC1 function in a plant is by creation of
dominant negatives. In this approach, non-functional, mutant LEC1 polypeptides, which
retain the ability to interact with wild-type subunits, are introduced into a plant. Identification
of residues that can be changed to create a dominant negative can be determined by published
work examining interaction of different subunits of CBF homologs from different species
(see, e.g., Sinha et al., (1995). Proc. Natl. Acad. Sci. USA 92:1624-1628).

Use of nucleic acids of the invention to enhance gene expression 

Isolated sequences prepared as described herein can also be used to prepare
expression cassettes which enhance or increase endogenous LEC1 gene expression. Where
overexpression of a gene is desired, the desired gene from a different species may be used to
decrease potential sense suppression effects. Enhanced expression of LEC1 polynucleotides
is useful, for example, to increase storage protein content in plant tissues. Such techniques
may be particularly useful for improving the nutritional value of plant tissues.

One of skill will recognize that the polypeptides encoded by the genes of the
invention, like other proteins, have different domains which perform different functions.
Thus, the gene sequences need not be full length, so long as the desired functional domain of
the protein is expressed. As explained above, LEC1 polypeptides share sequences with CBF
proteins. The DNA binding activity, and, therefore, transcription activation function, of
LEC1 polypeptides is thought to be modulated by a short region of seven residues
(MPIANVI) at residues 34-40 of SEQ. ID. No. 2. Thus, the polypeptides of the invention will
often retain these sequences. Modified protein chains can also be readily designed utilizing
various recombinant DNA techniques well known to those skilled in the art and described,
for instance, in Sambrook et al., supra. Hydroxylamine can also be used to introduce single
base mutations into the coding region of the gene (Sikorski, et al., (1991). Meth. Enzymol.
194:302-318). For example, the chains can vary from the naturally occurring sequence at
the primary structure level by amino acid substitutions, additions, deletions, and the like.
These modifications can be used in a number of combinations to produce the final
modified protein chain.
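
Whether a truncated or modified chain still carries the seven-residue region can be checked with a short script. The sketch below is only an illustration: the test protein is a made-up placeholder (alanines around the motif), not SEQ. ID. No. 2, and the function names are hypothetical.

```python
MOTIF = "MPIANVI"   # seven-residue region named in the text
MOTIF_START = 34    # 1-based position of its first residue in the full-length protein


def retains_motif(protein: str, motif: str = MOTIF, start: int = MOTIF_START) -> bool:
    """True if the motif sits at the expected 1-based position."""
    i = start - 1
    return protein[i:i + len(motif)].upper() == motif


def find_motif(protein: str, motif: str = MOTIF) -> list:
    """1-based start positions of every occurrence of the motif."""
    seq = protein.upper()
    hits = []
    i = seq.find(motif)
    while i != -1:
        hits.append(i + 1)
        i = seq.find(motif, i + 1)
    return hits


# Placeholder sequence: 33 alanines, the motif, then 10 alanines.
full_length = "A" * 33 + MOTIF + "A" * 10
truncated = full_length[19:]  # hypothetical N-terminal truncation of 19 residues

print(retains_motif(full_length))  # motif at residues 34-40
print(find_motif(truncated))       # motif shifts after truncation
```

For a truncated chain, `find_motif` is the more useful check, since the motif is still present but no longer at position 34.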

Desired modified LEC1 polypeptides can be identified using assays to
screen for the presence or absence of wild type LEC1 activity. Such assays can be based
on the ability of the LEC1 protein to functionally complement the hap3 mutation in yeast. As
noted above, it has been shown that homologs from different species functionally interact with
yeast subunits of the CBF (Sinha, et al., (1995). Proc. Natl. Acad. Sci. USA 92:1624-1628;
see, also, Becker, et al., (1991). Proc. Natl. Acad. Sci. USA 88:1968-1972). The reporter
for this screen can be any of a number of standard reporter genes such as the lacZ gene
encoding β-galactosidase that is fused with the regulatory DNA sequences and promoter of
the yeast CYC1 gene. This promoter is regulated by the yeast CBF.

A plasmid containing the LEC1 cDNA clone is mutagenized in vitro according
to techniques well known in the art. The cDNA inserts are excised from the plasmid and
inserted into the cloning site of a yeast expression vector such as pYES2 (Invitrogen). The
plasmid is introduced into hap3- yeast containing a lacZ reporter that is regulated by the yeast
CBF such as pLG265UP1-lacZ (Guarente, et al., (1984) Cell 36:317-321). Transformants
are then selected and a filter assay is used to test colonies for β-galactosidase activity. After
confirming the results of activity assays, immunochemical tests using a LEC1 antibody are
performed on yeast lines that lack β-galactosidase activity to identify those that produce
stable LEC1 protein but lack activity. The mutant LEC1 genes are then cloned from the yeast
and their nucleotide sequence determined to identify the nature of the lesions.

In other embodiments, the promoters derived from the LEC1 genes of the
invention can be used to drive expression of heterologous genes in an embryo-specific or
seed-specific manner, such that desired gene products are present in the embryo, seed, or
fruit. Suitable structural genes that could be used for this purpose include genes encoding
proteins useful in increasing the nutritional value of seed or fruit. Examples include genes
encoding enzymes involved in the biosynthesis of antioxidants such as vitamin A, vitamin
C, vitamin E and melatonin. Other suitable genes include those encoding proteins involved in
modification of fatty acids, or in the biosynthesis of lipids, proteins, and carbohydrates.
Still other genes can be those encoding proteins involved in auxin and auxin analog
biosynthesis for increasing fruit size, genes encoding pharmaceutically useful compounds,
and genes encoding plant resistance products to combat fungal or other infections of the
seed.

Typically, desired promoters are identified by analyzing the 5' sequences of
a genomic clone corresponding to the embryo-specific genes described here. Sequences
characteristic of promoter sequences can be used to identify the promoter. Sequences
controlling eukaryotic gene expression have been extensively studied. For instance,
promoter sequence elements include the TATA box consensus sequence (TATAAT),
which is usually 20 to 30 base pairs upstream of the transcription start site. In most
instances the TATA box is required for accurate transcription initiation. In plants, further
upstream from the TATA box, at positions -80 to -100, there is typically a promoter
element with a series of adenines surrounding the trinucleotide G (or T) N G (J. Messing
et al., in Genetic Engineering in Plants, pp. 221-227 (Kosuge, Meredith and Hollaender,
eds. (1983))).
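
A simple scan for the TATA box consensus in the expected window can be sketched as follows. This is an illustration under stated assumptions: the input is the sequence ending immediately before a putative transcription start site, distance is measured from the first base of the motif, only exact matches to TATAAT are reported, and the function name and test sequence are hypothetical.

```python
def tata_candidates(upstream: str, motif: str = "TATAAT", window=(20, 30)) -> list:
    """Positions (relative to the transcription start site) of exact motif
    matches whose first base lies within `window` base pairs upstream.

    `upstream` must end at position -1, i.e. its last base immediately
    precedes the transcription start site.
    """
    seq = upstream.upper()
    n = len(seq)
    hits = []
    i = seq.find(motif)
    while i != -1:
        distance = n - i  # how far upstream the motif's first base lies
        if window[0] <= distance <= window[1]:
            hits.append(-distance)
        i = seq.find(motif, i + 1)
    return hits


# Made-up upstream region: one motif far upstream, one 25 bp upstream.
upstream_region = "TATAAT" + "G" * 64 + "TATAAT" + "C" * 19
print(tata_candidates(upstream_region))
```

Because real promoters often deviate from the consensus, a practical search would usually allow mismatches or use a position weight matrix; the exact-match scan above is only the simplest form of the idea.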

A number of methods are known to those of skill in the art for identifying and
characterizing promoter regions in plant genomic DNA (see, e.g., Jordano, et al., Plant Cell,
1:855-866 (1989); Bustos, et al., Plant Cell 1:839-854 (1989); Green, et al., EMBO J. 7,
4035-4044 (1988); Meier, et al., Plant Cell, 3, 309-316 (1991); and Zhang, et al., Plant
Physiology 110:1069-1079 (1996)).



Preparation of recombinant vectors

To use isolated sequences in the above techniques, recombinant DNA
vectors suitable for transformation of plant cells are prepared. Techniques for
transforming a wide variety of higher plant species are well known and described in the
technical and scientific literature. See, for example, Weising et al., Ann. Rev. Genet.
22:421-477 (1988). A DNA sequence coding for the desired polypeptide, for example a
cDNA sequence encoding a full length protein, will preferably be combined with
transcriptional and translational initiation regulatory sequences which will direct the
transcription of the sequence from the gene in the intended tissues of the transformed
plant.

For example, for overexpression, a plant promoter fragment may be
employed which will direct expression of the gene in all tissues of a regenerated plant.
Such promoters are referred to herein as "constitutive" promoters and are active under
most environmental conditions and states of development or cell differentiation. Examples
of constitutive promoters include the cauliflower mosaic virus (CaMV) 35S transcription
initiation region, the 1'- or 2'- promoter derived from T-DNA of Agrobacterium
tumefaciens, and other transcription initiation regions from various plant genes known to
those of skill.

Alternatively, the plant promoter may direct expression of the
polynucleotide of the invention in a specific tissue (tissue-specific promoters) or may be
otherwise under more precise environmental control (inducible promoters). Examples of
tissue-specific promoters under developmental control include promoters that initiate
transcription only in certain tissues, such as fruit, seeds, or flowers. As noted above, the
promoters from the LEC1 genes described here are particularly useful for directing gene
expression so that a desired gene product is located in embryos or seeds. Other suitable
promoters include those from genes encoding storage proteins or the lipid body membrane
protein, oleosin. Examples of environmental conditions that may affect transcription by
inducible promoters include anaerobic conditions, elevated temperature, or the presence of
light.

If proper polypeptide expression is desired, a polyadenylation region at the
3'-end of the coding region should be included. The polyadenylation region can be
derived from the natural gene, from a variety of other plant genes, or from T-DNA.

The vector comprising the sequences (e.g., promoters or coding regions)
from genes of the invention will typically comprise a marker gene which confers a
selectable phenotype on plant cells. For example, the marker may encode biocide
resistance, particularly antibiotic resistance, such as resistance to kanamycin, G418,
bleomycin, hygromycin, or herbicide resistance, such as resistance to chlorosulfuron or
Basta.

Production of transgenic plants

DNA constructs of the invention may be introduced into the genome of the
desired plant host by a variety of conventional techniques. For example, the DNA
construct may be introduced directly into the genomic DNA of the plant cell using
techniques such as electroporation and microinjection of plant cell protoplasts, or the DNA
constructs can be introduced directly to plant tissue using ballistic methods, such as DNA
particle bombardment. Alternatively, the DNA constructs may be combined with suitable
T-DNA flanking regions and introduced into a conventional Agrobacterium tumefaciens
host vector. The virulence functions of the Agrobacterium tumefaciens host will direct the
insertion of the construct and adjacent marker into the plant cell DNA when the cell is
infected by the bacteria.

Microinjection techniques are known in the art and well described in the
scientific and patent literature. The introduction of DNA constructs using polyethylene
glycol precipitation is described in Paszkowski et al., EMBO J. 3:2717-2722 (1984).
Electroporation techniques are described in Fromm et al., Proc. Natl. Acad. Sci. USA
82:5824 (1985). Ballistic transformation techniques are described in Klein et al., Nature
327:70-73 (1987).

Agrobacterium tumefaciens-mediated transformation techniques, including
disarming and use of binary vectors, are well described in the scientific literature. See,
for example, Horsch et al., Science 233:496-498 (1984), and Fraley et al., Proc. Natl. Acad.
Sci. USA 80:4803 (1983).

Transformed plant cells which are derived by any of the above
transformation techniques can be cultured to regenerate a whole plant which possesses the
transformed genotype and thus the desired phenotype such as seedlessness. Such
regeneration techniques rely on manipulation of certain phytohormones in a tissue culture
growth medium, typically relying on a biocide and/or herbicide marker which has been
introduced together with the desired nucleotide sequences. Plant regeneration from
cultured protoplasts is described in Evans et al., Protoplasts Isolation and Culture,
Handbook of Plant Cell Culture, pp. 124-176, Macmillan Publishing Company, New
York, 1983; and Binding, Regeneration of Plants, Plant Protoplasts, pp. 21-73, CRC
Press, Boca Raton, 1985. Regeneration can also be obtained from plant callus, explants,
organs, or parts thereof. Such regeneration techniques are described generally in Klee et
al., Ann. Rev. of Plant Phys. 38:467-486 (1987).

The nucleic acids of the invention can be used to confer desired traits on 
essentially any plant. Thus, the invention has use over a broad range of plants, including 
species from the genera Asparagus, Atropa, Avena, Brassica, Citrus, Citrullus, Capsicum, 
Cucumis, Cucurbita, Daucus, Fragaria, Glycine, Gossypium, Helianthus, Heterocallis, 
Hordeum, Hyoscyamus, Lactuca, Linum, Lolium, Lycopersicon, Malus, Manihot, 
Majorana, Medicago, Nicotiana, Oryza, Panicum, Pennisetum, Persea, Pisum, Pyrus,
Prunus, Raphanus, Secale, Senecio, Sinapis, Solanum, Sorghum, Trigonella, Triticum,
Vitis, Vigna, and Zea. The LEC1 genes of the invention are particularly useful in the
production of transgenic plants in the genus Brassica. Examples include broccoli,
cauliflower, Brussels sprouts, canola, and the like.

One of skill will recognize that after the expression cassette is stably 
incorporated in transgenic plants and confirmed to be operable, it can be introduced into 
other plants by sexual crossing. Any of a number of standard breeding techniques can be 
used, depending upon the species to be crossed. 

Example 1 

This example describes the isolation and characterization of an exemplary
LEC1 gene.

Experimental Procedures 

Plant Material 

A lec1-2 mutant was identified from a population of Arabidopsis thaliana
ecotype Wassilewskija (Ws-0) lines mutagenized with T-DNA insertions as described before
(West et al., 1994). The abi3-3, fus3-3 and lec1-1 mutants were generously provided by
Peter McCourt, University of Toronto and David Meinke, Oklahoma State University. Wild
type plants and mutants were grown under constant light at 22°C.

Double mutants were constructed by intercrossing the mutant lines lec1-1,
lec1-2, abi3-3, fus3-3, and lec2. The genotype of the double mutants was verified through
backcrosses with each parental line. Double mutants were those that failed to complement
both parent lines. Homozygous single and double mutants were generated by germinating
intact seeds or dissected mature embryos before desiccation on basal media.

Isolation and Sequence Analysis of Genomic and cDNA Clones

Genomic libraries of Ws-0 wild type plants, lec1-1 and lec1-2 mutants were
made in GEM11 vector according to the instructions of the manufacturer (Promega). Two
silique-specific cDNA libraries (stages globular to heart and heart to young torpedo) were
made in ZAPII vector (Stratagene).

The genomic library of lec1-2 was screened using right and left T-DNA
specific probes according to standard techniques. About 12 clones that cosegregate with the
mutation were isolated and purified, and the entire DNAs were further labeled and used as
probes to screen a Southern blot containing wild type and lec1-1 genomic DNA. One clone
hybridized with plant DNA and was further analyzed. A 7.1 kb XhoI fragment containing
the left border and the plant sequence flanking the T-DNA was subcloned into
pBluescript-KS plasmid (Stratagene) to form ML7 and sequenced using a left border specific
primer (5' GCATAGATGCACTCGAAATCAGCC 3'). The T-DNA organization was
partially verified using Southern analysis with T-DNA left and right borders and pBR322
probes. The results suggested that the other end of the T-DNA is also composed of left
border. This was confirmed by generating a PCR fragment using a genomic plant DNA
primer (LP primer 5' GCT CTA GAC ATA CAA CAC TTT TCC TTA 3') and a T-DNA left
border specific primer (5' GCTTGGTAATAATTGTCATTAG 3') and sequencing.
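
Basic properties of the three primers quoted above can be tabulated with a short script. This sketch is offered only as a convenience for checking transcription of the sequences: the dictionary keys and function names are hypothetical labels, and the search for an XbaI recognition site (TCTAGA) in the LP primer is an illustrative check, not something the text states about the primer design.

```python
PRIMERS = {
    # Labels are hypothetical; sequences are copied from the text above.
    "left_border_sequencing": "GCATAGATGCACTCGAAATCAGCC",
    "LP": "GCT CTA GAC ATA CAA CAC TTT TCC TTA",
    "left_border_pcr": "GCTTGGTAATAATTGTCATTAG",
}


def normalize(primer: str) -> str:
    """Strip spaces and upper-case a primer written in grouped form."""
    return primer.replace(" ", "").upper()


def gc_percent(primer: str) -> float:
    """Percentage of G and C bases in the primer."""
    seq = normalize(primer)
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def contains_site(primer: str, site: str = "TCTAGA") -> bool:
    """True if the primer contains the given recognition sequence."""
    return site in normalize(primer)


for name, raw in PRIMERS.items():
    seq = normalize(raw)
    print(name, len(seq), round(gc_percent(raw), 1), contains_site(raw))
```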

The EcoRI insert of ML7 was used to screen a wild type genomic library.
Two overlapping clones were purified and a 7.4 kb EcoRI genomic fragment from the wild type
DNA region was subcloned into pBluescript-KS plasmid making WT74. This fragment was
sequenced (SEQ. ID. No. 4) and was used to screen the lec1-1 genomic library and wild type
silique-specific cDNA libraries. Eight clones from the lec1-1 genomic library were identified and
analyzed by restriction mapping.

From these clones the exact site of the deletion in lec1-1 was mapped and
sequenced by amplifying a Xbp PCR fragment using primers (H21 - 5' GTA AAA
ACA TCT AGG GTT GA 3'; H17 - 5' TTT GTG GTT GAC CGT TTG GC 3') flanking the
deletion region in lec1-1 genomic DNA. Clones were isolated from both cDNA libraries
and partially sequenced. The sequence of the cDNA clones and the wild type genomic clone
matched exactly, confirming that both derived from the same locus. All hybridizations were
performed under stringent conditions with 32P random prime probes (Stratagene).

Sequencing was done using the automated dideoxy chain termination method
(Applied Biosystems, Foster City, CA). Data base searches were performed at the National
Center for Biotechnology Information by using the BLAST network service. Alignment of
protein sequences was done using the PILEUP program (Genetics Computer Group, Madison,
WI).

DNA and RNA blot analysis 

Genomic DNA was isolated from leaves by using the CTAB-containing buffer
(Dellaporta, et al., (1983). Plant Mol. Biol. Reporter 1:19-21). Two micrograms of DNA
was digested with different restriction endonucleases, electrophoretically separated in 1%
agarose gel, and transferred to a nylon membrane (Hybond N; Amersham).

Total RNA was prepared from siliques, two-day-old seedlings, stems, leaves,
buds and roots. Poly(A)+ RNA was purified from total RNA by oligo(dT) cellulose
chromatography, and two micrograms of each Poly(A)+ RNA sample were separated in a 1%
denaturing formaldehyde-agarose gel. Hybridizations were done under stringent conditions
unless otherwise specified. Radioactive probes were prepared as described above.

Complementation of lec1 mutants

A 3.4 kb BstYI fragment of genomic DNA (SEQ. ID. No. 3) containing
sequences from 1.992 kb upstream of the ORF to a region 579 bp downstream from the poly
A site was subcloned into the hygromycin resistant binary vector pBIB-Hyg. The LEC1
cDNA was placed under the control of the 35S promoter and the ocs polyadenylation signals
by inserting a PCR fragment spanning the entire coding region into the plasmid pART7. The
entire regulatory fragment was then removed by digestion with NotI and transferred into the
hygromycin resistant binary vector BJ49. The binary vectors were introduced into the
Agrobacterium strain GV3101, and constructions were checked by re-isolation of the
plasmids and restriction enzyme mapping, or by PCR. Transformation of homozygous lec1-1
and lec1-2 mutants was done using the in planta transformation procedure (Bechtold, et al.,
(1993). Comptes Rendus de l'Académie des Sciences Série III Sciences de la Vie, 316:
1194-1199). Dry seeds from lec1 mutants were selected for transformants by their ability to
germinate after desiccation on plates containing 5g/ml hygromycin. The transformed plants
were tested for the presence of the transgene by PCR and by screening the siliques for the
presence of viable seeds.

In Situ Hybridization

Experiments were performed as described previously by Dietrich et al. (1989)
Plant Cell 1:73-80. Sections were hybridized with LEC1 antisense probe. As a negative
control, the LEC1 antisense probe was hybridized to seed sections of lec1 mutants. In
addition, a sense probe was prepared and reacted with the wild type seed sections.

Results 

Genetic Interaction Between Leafy Cotyledon-Type Mutants and abi3

In order to understand the genetic pathways which regulate late embryogenesis
we took advantage of three Arabidopsis mutants, lec2, fus3-3 and abi3-3, that cause similar
defects in late embryogenesis to those of lec1-1 or lec1-2. These mutants are desiccation
intolerant, sometimes viviparous and have activated shoot apical meristems. The lec2 and
fus3-3 mutants are sensitive to ABA and possess trichomes on their cotyledons and therefore
can be categorized as leafy cotyledon-type mutants (Meinke et al., 1994). The abi3-3
mutants belong to a different class of late embryo defective mutations that is insensitive to
ABA and does not have trichomes on the cotyledons.

The two classes of mutants were crossed to lec1-1 and lec1-2 mutants to
construct plants homozygous for both mutations. The lec1 and lec2 mutations interact
synergistically, resulting in a double mutant which is arrested in a stage similar to the late
heart stage; the double mutant embryo, however, is larger. The lec1 or lec2 and fus3-3
double mutants did not display any epistasis and the resulting embryo had an intermediate
phenotype. The lec1/abi3-3 double mutants and lec2/abi3-3 double mutants were ABA
insensitive and had a lec-like phenotype. There was no difference between double mutants that
consist of either lec1-1 or lec1-2.

No epistasis was seen between the double mutants, indicating that each of the
above genes, the LEC-type and ABI3 genes, operate in different genetic pathways.

LEC1 Functions Early in Embryogenesis

The effect of lec1 is not limited to late embryogenesis; it also has a role in
early embryogenesis. The embryos of the lec1/lec2 double mutants were arrested in the early
stages of development, while the single mutants developed into mature embryos, suggesting
that these genes act early during development.

Further examination of the early stages of the single and double mutants
showed defects in the shape, size and cell division pattern of the mutants' suspensors. The
suspensor of the wild type embryo consists of a single file of six to eight cells, whereas the
suspensors of the mutants are often enlarged and undergo periclinal divisions. Leafy
cotyledon mutants exhibit suspensor anomalies at the globular or transition stage whereas
wild type and abi3 mutants do not show any abnormalities.

The number of anomalous suspensors increases as the embryos continue to 
develop. At the torpedo stage, the wild type suspensor cells undergo programmed cell death, 
but in the mutants secondary embryos often develop from the abnormal suspensors and, when 
rescued, give rise to twins. 

The Organization of the LEC1 Locus in Wild Type Plants and lec1 Mutants

Two mutant alleles of the LEC1 gene have been reported, lec1-1 and lec1-2
(Meinke, 1992; West et al., 1994). Both mutants were derived from a population of plants
mutagenized insertionally with T-DNA (Feldmann and Marks, 1987), although lec1-1 is not
tagged. The lec1-2 mutant contains multiple T-DNA insertions. A specific subset of T-DNA
fragments was found to be closely linked with the mutation. A genomic library of lec1-2
was screened using right and left T-DNA borders as probes. Genomic clones containing
T-DNA fragments that cosegregate with the mutation were isolated and tested on Southern
blots of both wild type and lec1-1 plants. Only one clone hybridized with Arabidopsis DNA
and also gave a polymorphic restriction fragment in lec1-1.

The lec1-1 polymorphism resulted from a small deletion, approximately 2 kb in
length. Using sequences from the plant fragment flanking the T-DNA, the genomic wild type
DNA clones and the lec1-1 genomic clones were isolated. An EcoRI fragment of 7.4 kb of
the genomic wild type DNA that corresponded to the polymorphic restriction fragment in
lec1-1 was further analyzed and sequenced. The exact site of the deletion in lec1-1 was
identified using a PCR fragment that was generated by primers within the expected borders
of the deleted fragment, and sequencing.

In the wild type genomic DNA that corresponded to the lec1-1 deletion, a 626
bp ORF was identified. Southern analysis of wild type DNA and the two mutants' DNA
probed with the short DNA fragment of the ORF revealed that both the wild type and lec1-2
DNA contain the ORF while the lec1-1 genomic DNA did not hybridize. The exact insertion
site of the T-DNA in the lec1-2 mutant was determined by PCR and sequencing, and it was found
that the T-DNA was inserted 115 bp upstream of the ORF's translational initiation codon in
the 5' region of the gene.

At the site of the T-DNA insertion a small deletion of 21 plant nucleotides
and addition of 20 unknown nucleotides occurred. These results suggest that in lec1-2 the
T-DNA interferes with the regulation of the ORF while in lec1-1 the whole gene is deleted.
Thus, both lec1 alleles contain DNA disruptions at the same locus, confirming the identity of
the LEC1 locus.

The lec1 Mutants Can Be Complemented by Transformation

To prove that the 7.4 kb genomic wild type fragment indeed contained the
ORF of the LEC1 gene, we used a genomic fragment of 3395 bp (SEQ. ID. No. 3) within
that fragment to transform homozygous lec1-1 and lec1-2 plants. The clone consists of a
3395 bp BstYI restriction fragment containing the gene and the promoter region. The
translation start codon (ATG) of the polypeptide is at 1999 and the stop codon is at 2625
(TGA). There are no introns in the gene.
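
The coordinates above can be reconciled with the stated ORF and protein lengths by simple arithmetic. The sketch below rests on an assumption that is not stated in the text: positions are 1-based, 1999 is the A of the ATG, and 2625 is the last base of the TGA stop codon. Under that reading the inclusive span is a whole number of codons, and the 626 bp figure corresponds to the plain difference between the two coordinates.

```python
ATG_POSITION = 1999   # first base of the start codon (from the text)
STOP_POSITION = 2625  # assumed here to be the LAST base of the TGA stop codon

# Plain coordinate difference, matching the 626 bp figure quoted for the ORF.
coordinate_difference = STOP_POSITION - ATG_POSITION

# Inclusive span from the A of ATG through the final base of TGA.
inclusive_span = STOP_POSITION - ATG_POSITION + 1

# Split the inclusive span into codons; a remainder of 0 means it is in frame.
codons_including_stop, leftover_bases = divmod(inclusive_span, 3)

# The stop codon does not encode a residue.
encoded_residues = codons_including_stop - 1

print(coordinate_difference, inclusive_span, codons_including_stop,
      leftover_bases, encoded_residues)
```

With no introns in the gene, the result of 208 encoded residues agrees with the 208 amino acid protein reported for the cDNA; if 2625 instead marked the first base of the stop codon, the span would not divide evenly into codons.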

The transformed plants were selected on hygromycin plates and were confirmed to
contain the wild type DNA fragment by PCR analysis. Both transgenic mutants were able to
produce viable progeny that were desiccation tolerant and did not possess trichomes on their
cotyledons. We concluded that the 3.4 kb fragment can complement the lec1 mutation and,
since there is only one ORF in the deleted 2 kb fragment in lec1-1, we suggest that this ORF
corresponds to the LEC1 gene.

The LEC1 Gene is a Member of a Gene Family

In order to isolate the LEC1 gene, two cDNA libraries of young siliques were
screened using the 7.4 kb DNA fragment as a probe. Seventeen clones were isolated and after
further analysis and partial sequencing they were all found to be identical to the genomic
ORF. The cDNA contains a 626 bp ORF specifying a 208 amino acid protein (SEQ. ID. Nos. 1
and 2).

The LEC1 cDNA was used to hybridize a DNA gel blot containing Ws-0
genomic DNA digested with three different restriction enzymes. Using low stringency
hybridization we found that there is at least one more gene. This confirmed our finding of
two more Arabidopsis ESTs that show homology to the LEC1 gene.

The LEC1 Gene is Embryo Specific

The lec1 mutants are affected mostly during embryogenesis. Rescued mutants
can give rise to homozygous plants that have no obvious abnormalities other than the
presence of trichomes on their cotyledons and their production of defective progeny.
Therefore, we expected the LEC1 gene to have a role mainly during embryogenesis and not
during vegetative growth. To test this assumption Poly(A)+ RNA was isolated from siliques,
seedlings, roots, leaves, stems and buds of wild type plants and from siliques of lec1 plants.
Only one band was detected on northern blots using either the LEC1 gene as a probe or the
7.4 kb genomic DNA fragment, suggesting that there is only one gene in the genomic DNA
fragment which is active transcriptionally. The transcript was detected only in siliques
containing young and mature embryos and was not detected in seedlings, roots, leaves, stems
and buds, indicating that the LEC1 gene is indeed embryo specific. In addition, no RNA was
detected in siliques of both alleles of lec1 mutants, confirming that this ORF corresponds to
the LEC1 gene.

Expression Pattern of the LEC1 Gene

To study how the LEC1 gene specifies cotyledon identity, we analyzed its
expression by in situ hybridization. We specifically focused on young developing embryos
since the mutants' abnormal suspensor phenotype indicates that the LEC1 gene should be
active very early during development.

During embryogenesis, the LEC1 transcript was first detected in proglobular
embryos. The transcript was found in all cells of the proembryo and was also found in the
suspensor and the endosperm. However, from the globular stage onward it accumulated more
in the outer layer of the embryo, namely the protoderm, and in the outer part of the ground
meristem, leaving the procambium without a signal. At the torpedo stage the signal was
stronger in the cotyledons and the root meristem, and was more limited to the protoderm
layer. At the bent cotyledon stage the signal was present throughout the embryo, and at the
last stage of development, when the embryo is mature and fills the whole seed, we could not
detect the LEC1 transcript. This might be due to a sensitivity limitation and may imply that if
the LEC1 transcript is expressed at that stage it is not localized in the mature embryo, but
rather spread throughout the embryo.

The LEC1 Gene Encodes a Homolog of CCAAT Binding Factor

Comparison of the deduced amino acid sequence of LEC1 to the GenBank database
reveals significant similarity to a subunit of a transcription factor, the CCAAT box binding
factor (CBF). CBFs are a highly conserved family of transcription factors that regulate gene
activity in eukaryotic organisms (Mantovani, et al. (1992). Nucl. Acids Res. 20: 1087-1091).
They are hetero-oligomeric proteins that consist of three or four non-homologous
subunits. LEC1 was found to have high similarity to the CBF-A subunit. This subunit has three
domains: A and C, which show no conservation between kingdoms, and a central domain, B,
which is highly conserved evolutionarily. Similarly, the LEC1 gene is composed of three
domains. The LEC1 B domain shares 75%-85% similarity and 55%-63% identity
with different B domains that are found in organisms ranging from yeast to human. Within
this central domain, two highly conserved amino acid segments are present. Deletion and
mutagenesis analysis of the yeast CBF-A homolog, the hap3 protein, demonstrated that a short
region of seven residues (42-48) (LPIANVA) is required for binding the CCAAT box, while
the subunit interaction domain lies in the region between residues 69-80 (MQECVSEFISFV)
(Xing et al., supra). The LEC1 protein shares high homology with those regions.
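The correspondence between the two hap3 segments and LEC1 can be illustrated with a short comparison. The Python sketch below is a hedged illustration, not the analysis used in the original work: it takes the deduced LEC1 protein of SEQ. ID. No. 2 (written in one-letter code), locates the LEC1 segments of SEQ. ID. Nos. 5 and 6, and counts identical positions against the hap3 segments quoted above. The helper name `identity` is an assumption introduced here.

```python
# Illustrative sketch only. LEC1 is the deduced protein of SEQ. ID. No. 2 in
# one-letter code; the hap3 segments are those quoted in the text above.
# `identity` is a hypothetical helper, not a published method.

LEC1 = (
    "MTSSVIVAGAGDKNNG" "IVVQQQPPCVAREQDQ" "YMPIANVIRIMRKTLP"
    "SHAKISDDAKETIQEC" "VSEYISFVTGEANERC" "QREQRKTITAEDILWA"
    "MSKLGFDNYVDPLTVF" "INRYREIETDRGSALR" "GEPPSLRQTYGGNGIG"
    "FHGPSHGLPPPGPYGY" "GMLDQSMVMGGGRYYQ" "NGSSGQDESSVGGGSS"
    "SSINGMPAFDHYGQYK"
)

SEGMENTS = {
    # name: (LEC1 segment, hap3 segment)
    "CCAAT box binding": ("MPIANVI", "LPIANVA"),            # SEQ. ID. Nos. 5 / 17
    "subunit interaction": ("IQECVSEYISFV", "MQECVSEFISFV"),  # SEQ. ID. Nos. 6 / 18
}


def identity(a: str, b: str) -> int:
    """Count identical residues between two equal-length peptides."""
    if len(a) != len(b):
        raise ValueError("peptides must have the same length")
    return sum(x == y for x, y in zip(a, b))


for name, (lec1_seg, hap3_seg) in SEGMENTS.items():
    start = LEC1.index(lec1_seg) + 1  # 1-based position in LEC1
    end = start + len(lec1_seg) - 1
    same = identity(lec1_seg, hap3_seg)
    print(f"{name}: LEC1 residues {start}-{end}, "
          f"{same}/{len(lec1_seg)} identical to hap3")
```

This ungapped, position-by-position count is deliberately simple; a full comparison of the B domains would require a proper sequence alignment.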

DISCUSSION 

The lec1 mutant belongs to the leafy cotyledon class, which interferes mainly
with the embryo program; the genes of this class are therefore thought to play a central
regulatory role during embryo development. It was shown before that LEC1 gene activity is
required to suppress germination during the maturation stage. Therefore, we analyzed the
genetic interaction of homozygous double mutants of the different members of the leafy
cotyledon class and the abi3 mutant; ABI3 has an important role during embryo maturation.
All five combinations of the double mutants showed either an intermediate phenotype or an
additive effect. No epistatic relationship among the four genes was found. These findings
suggest that the different genes act in parallel genetic pathways. Of special interest was the
double mutant lec1/lec2, which was arrested morphologically at the heart stage but continued
to grow in that shape. This double mutant phenotype indicates that both genes, LEC1 and
LEC2, are essential for early morphogenesis and that their products may interact directly or
indirectly in the young developing embryo.

The Role of LEC1 in Embryogenesis

One of the proteins that mediate CCAAT box function is a heteromeric
protein called CBF (also called NFY or CP1). CBF is a transcription activator that regulates
constitutively expressed genes, but also participates in differential activation of developmental
genes (Wingender, E. (1993). Gene Regulation in Eukaryotes (New York: VCH Publishers)).
In mammalian cells, three subunits have been identified, CBF-A, CBF-B and CBF-C, all of
which are required for DNA binding. In yeast, the CBF homolog HAP activates CYC1
and other genes involved in mitochondrial electron transport (Johnson, et al., Proteins.
Annu. Rev. Biochem. 58, 799-840 (1989)). HAP consists of four subunits: hap2, hap3, hap4
and hap5. Only hap2, hap3 and hap5 are required for DNA binding. CBF-A, B and C show high
similarity to the yeast hap3, hap2 and hap5, respectively. It was also reported that mammalian
CBF-A and B are functionally interchangeable with the corresponding yeast subunits
(Sinha et al., supra).

The LEC1 gene encodes a protein that shows more than 75% similarity to the
conserved region of CBF-A. CCAAT motifs are not common in plant promoters and their
role in transcription regulation is not clear. However, maize and Brassica homologs have
been identified. A search of the Arabidopsis GenBank entries revealed several ESTs that show
high similarity to CBF-A, B and C. Accession numbers of CBF-A (HAP3) homologs: H37368,
H76589; CBF-B (HAP2) homologs: T20769; CBF-C (HAP5) homologs: T43909, T44300.
These findings and the pleiotropic effects of LEC1 suggest that LEC1 is a member of a
heteromeric complex that functions as a transcription factor.

The model suggests that LEC1 acts as a transcription activator of several sets of
genes, which keep the embryonic program on and repress the germination process.
Defective LEC1 expression partially shuts down the embryonic program, and as a result the
cotyledons lose their embryonic characteristics and the germination program is active in the
embryo.

Example 2 

This example demonstrates that LEC1 is sufficient to induce embryonic
pathways in transgenic plants.

The phenotype of lec1 mutants and the gene's expression pattern indicated
that LEC1 functions specifically during embryogenesis. A LEC1 cDNA clone under the
control of the cauliflower mosaic virus 35S promoter was transferred into lec1-1 mutant
plants in planta using standard methods as described above.

Viable dry seeds were obtained from lec1-1 mutants transformed with the
35S/LEC1 construct. However, the transformation efficiency was only approximately 0.6%
of that obtained normally. In several experiments, half the seeds that germinated (12/23)
produced seedlings with an abnormal morphology. Unlike wild type seedlings, these
35S/LEC1 seedlings possessed cotyledons that remained fleshy and that failed to expand.
Roots often did not extend or extended abnormally and sometimes greened. These seedlings
occasionally produced a single pair of organs on the shoot apex at the position normally
occupied by leaves. Unlike wild type leaves, these organs did not expand and did not possess
trichomes. Morphologically, these leaf-like structures more closely resembled embryonic
cotyledons than leaves.

The other 35S/LEC1 seeds that remained viable after drying produced plants
that grew vegetatively. The majority of these plants (7) flowered and produced 100% lec1
mutant seeds. Amplification experiments confirmed that the seedlings contained the
transgene, suggesting that the 35S/LEC1 gene was inactive in these T2 seeds. No vegetative
abnormalities were observed in these plants, with the exception that a few displayed defects in
apical dominance. A few plants (2) were male sterile and did not produce progeny. One
plant that produced progeny segregated 25% mutant Lec1- seeds that, when germinated
before desiccation and grown to maturity, gave rise to 100% mutant seed, as expected for a
single transgene locus. The other 75% of seeds contained embryos with either a wild type
phenotype or a phenotype intermediate between lec1 mutants and wild type. Only 25% of the
dry seed from this plant germinated, and all seedlings resembled the embryo-like seedlings
described above. Some seedlings continued to grow and displayed a striking phenotype.
These 35S/LEC1 plants developed two types of structures on leaves. One type resembled
embryonic cotyledons while the other looked like intact torpedo stage embryos. Thus,
ectopic expression of LEC1 induces the morphogenesis phase of embryo development in
vegetative cells.
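The 25% mutant class reported for the segregating plant is the fraction expected from simple Mendelian arithmetic. The Python sketch below is a minimal illustration under stated assumptions: the parent is taken to be hemizygous for a single 35S/LEC1 insertion (written "T", with "-" for its absence) in the homozygous lec1-1 background, and both gamete types are assumed to be transmitted equally. The function name `selfed_progeny` is hypothetical.

```python
# Illustrative sketch only: expected genotype frequencies after selfing a plant
# hemizygous for one transgene locus ("T" = transgene present, "-" = absent).
# Assumes equal transmission of both gamete types (an assumption, not a result).
from fractions import Fraction
from itertools import product


def selfed_progeny(parent=("T", "-")):
    """Return {genotype: frequency} for all egg x pollen combinations."""
    weight = Fraction(1, len(parent) ** 2)
    counts = {}
    for egg, pollen in product(parent, repeat=2):
        # Sort so that "T-" and "-T" are counted as the same genotype.
        genotype = "".join(sorted((egg, pollen), reverse=True))
        counts[genotype] = counts.get(genotype, Fraction(0)) + weight
    return counts


progeny = selfed_progeny()
for genotype, freq in progeny.items():
    print(genotype, freq)

# Progeny lacking the transgene are expected to show the lec1 mutant phenotype.
print("expected mutant fraction:", progeny["--"])
```

Under these assumptions one quarter of the progeny carry no transgene copy, which is consistent with the 25% mutant seeds described above for a single transgene locus.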

Because many 35S/LEC1 seedlings exhibited embryonic characteristics, the
seedlings were analyzed for expression of genes specifically active in embryos. Cruciferin A
storage protein mRNA accumulated throughout the 35S/LEC1 seedlings, including the
leaf-like structures. Proteins with sizes characteristic of the 12S storage protein cruciferin
accumulated in these transgenic seedlings. Thus, 35S/LEC1 seedlings displaying an
embryo-like phenotype accumulated embryo-specific mRNAs and proteins. LEC1 mRNA
accumulated to a high level in these 35S/LEC1 seedlings in a pattern similar to that of early
stage embryos, but not in wild type seedlings. LEC1 is therefore sufficient to alter the fate of
vegetative cells by inducing embryonic programs of development.

The ability of LEC1 to induce embryonic programs of development in
vegetative cells establishes the gene as a central regulator of embryogenesis. LEC1 is
sufficient to induce the seed maturation pathway, as indicated by the induction of storage
protein genes in the 35S/LEC1 seedlings. The presence of ectopic embryos on leaf surfaces
and of cotyledons at the position of leaves also shows that LEC1 can activate the embryo
morphogenesis pathway. Thus, LEC1 regulates both early and late embryonic processes.

The above examples are provided to illustrate the invention but not to limit its
scope. Other variants of the invention will be readily apparent to one of ordinary skill in the
art and are encompassed by the appended claims. All publications, patents, and patent
applications cited herein are hereby incorporated by reference.

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 627 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..627

(D) OTHER INFORMATION: /product= "LEC1"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

ATG ACC AGC TCA GTC ATA GTA GCC GGC GCC GGT GAC AAG AAC AAT GGT 48
Met Thr Ser Ser Val Ile Val Ala Gly Ala Gly Asp Lys Asn Asn Gly
1 5 10 15

ATC GTG GTC CAG CAG CAA CCA CCA TGT GTG GCT CGT GAG CAA GAC CAA 96
Ile Val Val Gln Gln Gln Pro Pro Cys Val Ala Arg Glu Gln Asp Gln
20 25 30

TAC ATG CCA ATC GCA AAC GTC ATA AGA ATC ATG CGT AAA ACC TTA CCG 144
Tyr Met Pro Ile Ala Asn Val Ile Arg Ile Met Arg Lys Thr Leu Pro
35 40 45

TCT CAC GCC AAA ATC TCT GAC GAC GCC AAA GAA ACG ATT CAA GAA TGT 192
Ser His Ala Lys Ile Ser Asp Asp Ala Lys Glu Thr Ile Gln Glu Cys
50 55 60

GTC TCC GAG TAC ATC AGC TTC GTG ACC GGT GAA GCC AAC GAG CGT TGC 240
Val Ser Glu Tyr Ile Ser Phe Val Thr Gly Glu Ala Asn Glu Arg Cys
65 70 75 80

CAA CGT GAG CAA CGT AAG ACC ATA ACT GCT GAA GAT ATC CTT TGG GCT 288
Gln Arg Glu Gln Arg Lys Thr Ile Thr Ala Glu Asp Ile Leu Trp Ala
85 90 95

ATG AGC AAG CTT GGG TTC GAT AAC TAC GTG GAC CCC CTC ACC GTG TTC 336
Met Ser Lys Leu Gly Phe Asp Asn Tyr Val Asp Pro Leu Thr Val Phe
100 105 110

ATT AAC CGG TAC CGT GAG ATA GAG ACC GAT CGT GGT TCT GCA CTT AGA 384
Ile Asn Arg Tyr Arg Glu Ile Glu Thr Asp Arg Gly Ser Ala Leu Arg
115 120 125

GGT GAG CCA CCG TCG TTG AGA CAA ACC TAT GGA GGA AAT GGT ATT GGG 432
Gly Glu Pro Pro Ser Leu Arg Gln Thr Tyr Gly Gly Asn Gly Ile Gly
130 135 140

TTT CAC GGC CCA TCT CAT GGC CTA CCT CCT CCG GGT CCT TAT GGT TAT 480
Phe His Gly Pro Ser His Gly Leu Pro Pro Pro Gly Pro Tyr Gly Tyr
145 150 155 160

GGT ATG TTG GAC CAA TCC ATG GTT ATG GGA GGT GGT CGG TAC TAC CAA 528
Gly Met Leu Asp Gln Ser Met Val Met Gly Gly Gly Arg Tyr Tyr Gln
165 170 175

AAC GGG TCG TCG GGT CAA GAT GAA TCC AGT GTT GGT GGT GGC TCT TCG 576
Asn Gly Ser Ser Gly Gln Asp Glu Ser Ser Val Gly Gly Gly Ser Ser
180 185 190

TCT TCC ATT AAC GGA ATG CCG GCT TTT GAC CAT TAT GGT CAG TAT AAG 624
Ser Ser Ile Asn Gly Met Pro Ala Phe Asp His Tyr Gly Gln Tyr Lys
195 200 205

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 208 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Met Thr Ser Ser Val Ile Val Ala Gly Ala Gly Asp Lys Asn Asn Gly
1 5 10 15

Ile Val Val Gln Gln Gln Pro Pro Cys Val Ala Arg Glu Gln Asp Gln
20 25 30

Tyr Met Pro Ile Ala Asn Val Ile Arg Ile Met Arg Lys Thr Leu Pro
35 40 45

Ser His Ala Lys Ile Ser Asp Asp Ala Lys Glu Thr Ile Gln Glu Cys
50 55 60

Val Ser Glu Tyr Ile Ser Phe Val Thr Gly Glu Ala Asn Glu Arg Cys
65 70 75 80

Gln Arg Glu Gln Arg Lys Thr Ile Thr Ala Glu Asp Ile Leu Trp Ala
85 90 95

Met Ser Lys Leu Gly Phe Asp Asn Tyr Val Asp Pro Leu Thr Val Phe
100 105 110

Ile Asn Arg Tyr Arg Glu Ile Glu Thr Asp Arg Gly Ser Ala Leu Arg
115 120 125

Gly Glu Pro Pro Ser Leu Arg Gln Thr Tyr Gly Gly Asn Gly Ile Gly
130 135 140

Phe His Gly Pro Ser His Gly Leu Pro Pro Pro Gly Pro Tyr Gly Tyr
145 150 155 160

Gly Met Leu Asp Gln Ser Met Val Met Gly Gly Gly Arg Tyr Tyr Gln
165 170 175

Asn Gly Ser Ser Gly Gln Asp Glu Ser Ser Val Gly Gly Gly Ser Ser
180 185 190

Ser Ser Ile Asn Gly Met Pro Ala Phe Asp His Tyr Gly Gln Tyr Lys
195 200 205



(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3395 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

AGATCCAAAA CAGGTCATGG ACTGGGCCGT AAACTCTATC CAAAATTCTT CATGTTTTTC 60
CATCTTTCAA AAATCTTTAT CCACCATTCC ATTACTAGGG TGTTGGTTTT ATTTTATTTG 120
TTGATTAATT ATGTATTAGA AAATGTAAAG CAATATTCAA TTGTAACATG CATCATCTAA 180
CACCAATATC TTGTACTAAC CTTTTGTAAT TTTCCTATAA ACATTTTAAA AGGCTAATTT 240
AAATAAAAAT TACAATAAAC GTGATAACTC ACTTTCGTAA CGCATATTTA TTCAAATATA 300
CCAAAATTTA CCATTTTAAG TAAGAGAATC TTTTTAAAAT TAATTTTCAA TTTCATTAAT 360
TAAGAAACAA AGAATTTACT GAAACCTATA TTTTATTAAA TTTTAATAAA ATATATGACT 420
AAAATAACGT CACGTGAATC TTTCTCAGCC GTTCGATAAT CGAATACTTT ATTGACTAAG 480
TATTTATTTA GAAAATTTTA AACAACACTT AATTTCTAGA AACAAAGAGA GCCTCATATG 540
TATAAAAATC TTCTTCTTAT CTTTCTTTCT TTCTTAATAG TCTTTATTTT TACTTAATTA 600
CTTTGGTAAT TTGTGAAAAA CACAACCAAT GAGAGAAGAG CAGTTTGACT GGCCACATAG 660
CCAATGAGAC AAGCCAATGG GAAAGAGATA TAGAGACCTC GTAAGAACCG CTCCTTTGCC 720
ATTTGTATCA TCTCTCTATA AAACCACTCA ACCATCAACC TNTCTTTGCA TGCAACAAAT 780
CACTCAAATA ATTATTTTAT AAAGAACAAA AAAAAAAAGA CGGCAGAGAA ACAATGGAAC 840
GTGGAGCTCC CTTCTCTCAC TATCAGCTAC CCAAATCCAT CTCTGGTAAT CTAAGTGGCT 900
ATTTGTATAC AGTATATACT TGCCTCCATG TATATTTATA TTCTCGTGAA AAATTGGAGA 960
CATGCTTTAT GAATTTTATG AGACTTTGCA ACAACGAACG AGATGCTTTC TCTCTAGAAA 1020
TTTAAATTTA GATTTGTGAA GGTTTTGGGA ATGGCCCGGA GAAGACGATT TTATATATAC 1080
ATGCATGCAA GAGTTTGATA TGTATATTGT TTCATCATGG CTGAGTCAAA GTTTTATCCA 1140
AATATTTCCA TGGTGTGGTA TTAGTTAAAC AAATCTCTCG TATGTGTCAT TGAATATACC 1200
CGTGCATGTA CCAGGAATGT TTTTGATTCT AAAAACGTTT TTTTCTTTGT TGTAACGGTT 1260
GAGTTTTTTT CTTCGTTTCA AAACGAGATT CTCGTTTGTC TCTTCCCTTG TCTAAAAACA 1320
TCTACGGTTC ATGTGATTCA AAAACACTAA AAAAATATAA ACTCATTTTT TTTTAATACT 1380
TAACATTTAA ACTATATATA TATATATATA TATATATATC TTATACTAGT CCCAAGTTTT 1440
AGTGTGAGGT TTTTTTATTC AAAATCTATC AGTACATTTT TTGGAAAAGA ACTAAGTGAA 1500
ATTTTCTCCA AATTTTCCTT TTACTATTGA TTTTTTAATT ACTGGATGTC ATTAACTTTA 1560
ATCTTTTGAT TCTTTCAACG TTTACCATTG GGAACCTTCA CATGAAATAA ATGTCTACTT 1620

TATTGAGTCA TACCTTCGTC AACATAAATT AATTGATGTT CTTCTCCAAA TTTTGAGTTT 1680
TTGGTTTTTC TAATAATCTT AACGAAAGCT TTTTGGTATA CATGTAAAAC GTAACGGCAA 1740
GAATCTGAAC AGTCTACTCA ACGGGGTCCA TAAGTCTAGA ATGTAGACCC CACAAACTTA 1800
CTCTTATCTT ATTGGTCCGT AACTAAGAAC GTGTCCCTCT GATTCTCTTG TTTTCTTCTA 1860
ATTAATTCGT ATCCTACAAA TTTAATTATC ATTTCTACTT CAACTAATCT TTTTTTATTT 1920
CCTAAAGATT TCAATTTCTC TCTGTATTTT CTATGAACAG AATTGAACTT GGACCAGCAC 1980
AGCAACAACC CAACCCCAAT GACCAGCTCA GTCATAGTAG CCGGCGCCGG TGACAAGAAC 2040
AATGGTATCG TGGTCCAGCA GCAACCACCA TGTGTGGCTC GTGAGCAAGA CCAATACATG 2100
CCAATCGCAA ACGTCATAAG AATCATGCGT AAAACCTTAC CGTCTCACGC CAAAATCTCT 2160
GACGACGCCA AAGAAACGAT TCAAGAATGT GTCTCCGAGT ACATCAGCTT CGTGACCGGT 2220
GAAGCCAACG AGCGTTGCCA ACGTGAGCAA CGTAAGACCA TAACTGCTGA AGATATCCTT 2280
TGGGCTATGA GCAAGCTTGG GTTCGATAAC TACGTGGACC CCCTCACCGT GTTCATTAAC 2340
CGGTACCGTG AGATAGAGAC CGATCGTGGT TCTGCACTTA GAGGTGAGCC ACCGTCGTTG 2400
AGACAAACCT ATGGAGGAAA TGGTATTGGG TTTCACGGCC CATCTCATGG CCTACCTCCT 2460
CCGGGTCCTT ATGGTTATGG TATGTTGGAC CAATCCATGG TTATGGGAGG TGGTCGGTAC 2520
TACCAAAACG GGTCGTCGGG TCAAGATGAA TCCAGTGTTG GTGGTGGCTC TTCGTCTTCC 2580
ATTAACGGAA TGCCGGCTTT TGACCATTAT GGTCAGTATA AGTGAAGAAG GAGTTATTCT 2640
TCATTTTTAT ATCTATTCAA AACATGTGTT TCGATAGATA TTTTATTTTT ATGTCTTATC 2700
AATAACATTT CTATATAATG TTGCTTCTTT AAGGAAAAGT GTTGTATGTC AATACTTTAT 2760
GAGAAACTGA TTTATATATG CAAATGATTG AATCCAAACT GTTTTGTGGA TTAAACTCTA 2820
TGCAACATTA TATATTTACA TGATCTAAAG GTTTTGTAAT TCAAAAGCTG TCATAGTTAG 2880
AAGATAACTA AACATTGTAG TAACCAAGTT TAATTTACTT TTTTGAGTTT ACATAACTAA 2940
CCAAGCCAAA AGGTTATAAA ATCTAAATTC GTTGAGTTGT CAAACTTCTG AAGATTGCTA 3000
TCCTCTTTGA GTTGCTTTCT TTTGGGTGCT TGAGTTTCAT TAGGCTGAGC TGACTCGTTG 3060
CTCTCTAGTC TTTCATCTCT GTCTTTTCCA AGGATTCATA ACGTTGGTCG CTCTCTGTTT 3120
CTGCCTACAC TTCTTCAAGG GATCATTACT GAGGCTAAGA GTTAAAGACC TGAACCATGG 3180
TTTTCTGTAA CTGGTTCAAG TTCATTCTCC GGTTATTGTG TGGTTATCTT TCGGTTAGAT 3240
TGAAACCCAT ATGTTTGCTC TGTTTCTTCT AGTTCCAAGT TTAATTTCCG GTTATTGTTT 3300
GGCTTTTTAA AAGTTTTTAA GGTCTATTCT ATGTAAAGAC TATTCTACGT ACGTACATTT 3360
ATCGCAAAAT TGAAAGATTA TAAAAAAAAT TGAAA 3395

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7560 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

AATTNACCCT CACTAAAGGG AACAAAAGCT GGGTACCGGG CCCCCCCTCG AGGTCGACGG 60
TATCGATAAG CTTGATATCG AATTCGTGGC CATTAGACCC ATAACTATAT GACGATGTTA 120
AAGAGTU^T AAATCATAAA TAAAATAAGA GTCCTTATCA ATAAACCTAA TTGGCTAATT 180
TCAACCTCAA AGAGTAGTAG GAACAGGTAA GGTGAAGCCA AACAGCTCCT TTTACAGTTG 240
GACCACTAGA GCTGATCTGG CATACAAAGT ATGCTTATTG GGCTGTCACG GCCCATCCGC 300
AAAATGTCGT TGGTTACGAA GCATCCACGA CATAGACGGT GCCACATGTT AGAAAAGTGT 360
TTCGGCGATC AAGATTGTGT CCACATCATT AGACGTCTGA ACTGTCCACG TGTCTATCAA 420
AGCTGGCGTC AAACATTACG TTTTCGTCGT TTGCGCCTCC TAGTTCACAC GTGCAACGAA 480
CGCGTGCGAC GTATCA7VAAT TGTTAATTTT AGCCATGTAT AAAGAATATC TACAAAATTA 540
ACCTCAGGAA TATTTTTGTT TTTTCAATTG AGGCCATAAT ATACNTNCCG ATNGAAAAAT 600
TTTNCANCAT ATCNCTAATA TCAAAAAATT ATGATGTTAG TAAACGTAAA AAATTTACAC 660
AAAATAANTT TCACAAAACT TANNGGGGAA ATTGGAACAA ANAAAAGACT GGTGAGTGAT 720
AAGCGATGAT GGCCGGTGAA TCAGGTAGCC GTCCTACAAC GTGGTTGATT TTGAGCAAAC 780
TCCTATCTAC TCTTCACACT ATTGGAAATC CCAAAATGTC GTCACACCAT AATAATGTGA 840
ATTTTGTTAT GGAATTTGAG GGAAACAGTA GATATATGTT TCAACCAGTG AAAGTTACCC 900
TCCTTTGGAC ATATCTACGA NAGTAGAAAG TAGAAACATT CACTAAACGT GACAACTTTA 960
TAAATTTTCT TTTTGTAACT TTTCTTTAGA TTTATTTACG ANAAGAGAAA TAT7VAACGTC 1020
ATGCTAATAA AAAATGCATT ATTTTCTACC ATCTAGCTAG AATATTGATC AAGTCTTCAC 1080
GTTTTTTGTT TATCTCTTCT CTCATAGGCA TGTCCACAAA AGGGTAAGTT TTACTGGTTC 1140
AAAATATTGC ATGAGTACTA CTAAGCTCGT ATAGTTTGAT CTTACTATCA TTGCGATGAG 1200




GGTTGTTAGT TTGGAAGAAA TAAGGATTTA TGCAAATGGT AATCATTATG TCTGCTATTT 1260
AAGAAGTAAA TTATGATGCT TGTTGCGTGA ACATATTAAA TTTGCGAAAA ATAAGCTU^GG 1320
ATACACGAGA GAAGCTCAGA TATTCACGTA ACGATGTTTC ATCTCTTCTC ATTGAGGAAA 1380
CATATGGCCA TGATATAGCT AATAAGCCTA CGGGATTGTC NTTTCAACGC CGAATCTACC 1440
AAACTGTTCC ATCTCTTATT ATATATAGTT TGGTTATTTA AGTAATTAGA TGCATCATAA 1500
TCTTTTTTTC TGCCAGTTGT AATGCAGATA T^AAATATATT GGTTGTTCTA AGGATTGTTC 1560
AAACGTGCAT GTGTACAAGT TATTATTTAT ATACTTTCAT CTACATGCGA TGCGTTATTT 1620
ATAATGATAA AACTAAGATT TTTAGTTAAA TTTAATAAAG AGCTTACGAG CTACAATTAA 1680
TTAGAAATGG TTGCTCAGAA ATCAGAATAC TATATATGAA AAAAGAAGTT GGTATACTTG 1740
AAAAAAGAAA AAACTACTTG AAAAGATGGT AAAAGATATA GAACGAGTAT ATATCTTACT 1800
CAAGCACGAT AGAAGTTTGT ATCAAAACAT TGCGTTCCAA ACCAATGTTT GAAGATGGTC 1860
AAAGGTGCTA CTCATGATGT GGTGCGAAGA AGCTTACGAA AAATTCTGCA ATGAGAGATA 1920
ACTTTATGGG CTGCTTGTTC AATATATTGA AAATCATGGT AGACAACACC AAACTCTCCT 1980
TTACCAGAAG TCATATTTCC TTAACCTCAG AATAAGTAAA TCTTCTAGTT TATTATTTGA 2040
AAGTTGAGCG TATAATTGCA ATGAAACTTT TACCAATTCA CCGCCTCCTA ACTGAGTTGT 2100
TGTATTATCC TATCTCTTTA GCTATCCTTT CCTTGCTCTT GCTCCACCTG CATGTGGCCT 2160



CTTTATTTAT AATCTCTCTA GATTCTGCTA AAGATGTNTG TTCAAAATGG TTTATCTTTA 2220
AGGGAAGCAA AGTGAATGGA AACATTTAAA GAAAAAAAAA ACTTTTAGCA GAGTTCCATG 2280
AGATTTCATA CTGATGATAA CTAAAATAAT CTTATATGCG TAAGATTATT TTAGTTCTAA 2340
ACTTCATTTT GAAATGAGAG GTCATTGGCC AGGAAAGATT CAATATTGGT TCTTTGTTT^ 2400
TTCTCGTTGG TTTGTTTTTA GTATGGGCTA GATCCAAAAC AGGTCATGGA CTGGGCCGTA 2460
AACTCTATCC AAAATTCTTC ATGTTTTTCC ATCTTTCAAA AATCTTTATC CACCATTCCA 2520
TTACTAGGGT GTTGGTTTTA TTTTATTTGT TGATTAATTA TGTATTAGAA AATGTAAAGC 2580
AATATTCAAT TGTAACATGC ATCATCTAAC ACCAATATCT TGTACTAACC TTTTGTAATT 2640
TTCCTATAAA CATTTTAAAA GGCTAATTTA AATAAAAATT ACAATAAACG TGATAACTCA 2700
CTTTCGTAAC GCATATTTAT TCAAATATAC CAAAATTTAC CATTTTAAGT AAGAGAATCT 2760
TTTTAAAATT AATTTTCAAT TTCATTAATT AAGAAACAAA GAATTTACTG AAACCTATAT 2820
TTTATTAAAT TTTAATAAAA TATATGACTA AAATAACGTC ACGTGAATCT TTCTCAGCCG 2880
TTCGATAATC GAATACTTTA TTGACTAAGT ATTTATTTAG AAAATTTTAA ACAACACTTA 2940
ATTTCTAGAA ACAAAGAGAG CCTCATATGT ATAAAAATCT TCTTCTTATC TTTCTTTCTT 3000

TCTTAATAGT CTTTATTTTT ACTTAATTAC TTTGGTAATT TGTGAAAAAC ACAACCAATG 3060
AGAGAAGAGC AGTTTGACTG GCCACATAGC CAATGAGACA AGCCAATGGG AAAGAGATAT 3120
AGAGACCTCG TAAGAACCGC TCCTTTGCCA TTTGTATCAT CTCTCTATAA AACCACTCAA 3180
CCATCAACCT NTCTTTGCAT GCAACAAATC ACTCAAATAA TTATTTTATA AAGAACAAAA 3240
AAAAAAAGAC GGCAGAGAAA CAATGGAACG TGGAGCTCCC TTCTCTCACT ATCAGCTACC 3300
CAAATCCATC TCTGGTAATC TAAGTGGCTA TTTGTATACA GTATATACTT GCCTCCATGT 3360
ATATTTATAT TCTCGTGAAA AATTGGAGAC ATGCTTTATG AATTTTATGA GACTTTGCAA 3420
CAACGAACGA GATGCTTTCT CTCTAGAAAT TTAAATTTAG ATTTGTGAAG GTTTTGGGAA 3480
TGGCCCGGAG AAGACGATTT TATATATACA TGCATGCAAG AGTTTGATAT GTATATTGTT 3540
TCATCATGGC TGAGTCAAAG TTTTATCCAA ATATTTCCAT GGTGTGGTAT TAGTTAAACA 3600
AATCTCTCGT ATGTGTCATT GAATATACCC GTGCATGTAC CAGGAATGTT TTTGATTCTA 3660
AAAACGTTTT TTTCTTTGTT GTAACGGTTG AGTTTTTTTC TTCGTTTCAA AACGAGATTC 3720
TCGTTTGTCT CTTCCCTTGT CTAAAAACAT CTACGGTTCA TGTGATTCAA AAACACTAAA 3780
AAAATATAAA CTCATTTTTT TTTAATACTT AACATTTAAA CTATATATAT ATATATATAT 3840
ATATATATCT TATACTAGTC CCAAGTTTTA GTGTGAGGTT TTTTTATTCA AAATCTATCA 3900
GTACATTTTT TGGAAAAGAA CTAAGTGAAA TTTTCTCCAA ATTTTCCTTT TACTATTGAT 3960
TTTTTAATTA CTGGATGTCA TTAACTTTAA TCTTTTGATT CTTTCAACGT TTACCATTGG 4020
GAACCTTCAC ATGAAATAAA TGTCTACTTT ATTGAGTCAT ACCTTCGTCA ACATAAATTA 4080
ATTGATGTTC TTCTCCAAAT TTTGAGTTTT TGGTTTTTCT AATAATCTTA ACGAAAGCTT 4140
TTTGGTATAC ATGTAAAACG TAACGGCAAG AATCTGAACA GTCTACTCAA CGGGGTCCAT 4200
AAGTCTAGAA TGTAGACCCC ACAAACTTAC TCTTATCTTA TTGGTCCGTA ACTAAGAACG 4260
TGTCCCTCTG ATTCTCTTGT TTTCTTCTAA TTAATTCGTA TCCTACAAAT TTAATTATCA 4320
TTTCTACTTC AACTAATCTT TTTTTATTTC CTAAAGATTT CAATTTCTCT CTGTATTTTC 4380
TATGAACAGA ATTGAACTTG GACCAGCACA GCAACAACCC AACCCCAATG ACCAGCTCAG 4440
TCATAGTAGC CGGCGCCGGT GACAAGAACA ATGGTATCGT GGTCCAGCAG CAACCACCAT 4500
GTGTGGCTCG TGAGCAAGAC CAATACATGC CAATCGCAAA CGTCATAAGA ATCATGCGTA 4560
AAACCTTACC GTCTCACGCC AAAATCTCTG ACGACGCCAA AGAAACGATT CAAGAATGTG 4620
TCTCCGAGTA CATCAGCTTC GTGACCGGTG AAGCCAACGA GCGTTGCCAA CGTGAGCAAC 4680
GTAAGACCAT AACTGCTGAA GATATCCTTT GGGCTATGAG CAAGCTTGGG TTCGATAACT 4740

ACGTGGACCC CCTCACCGTG TTCATTAACC GGTACCGTGA GATAGAGACC GATCGTGGTT 4800
CTGCACTTAG AGGTGAGCCA CCGTCGTTGA GACAAACCTA TGGAGGAAAT GGTATTGGGT 4860
TTCACGGCCC ATCTCATGGC CTACCTCCTC CGGGTCCTTA TGGTTATGGT ATGTTGGACC 4920
AATCCATGGT TATGGGAGGT GGTCGGTACT ACCAAAACGG GTCGTCGGGT CAAGATGAAT 4980
CCAGTGTTGG TGGTGGCTCT TCGTCTTCCA TTAACGGAAT GCCGGCTTTT GACCATTATG 5040
GTCAGTATAA GTGAAGAAGG AGTTATTCTT CATTTTTATA TCTATTCAAA ACATGTGTTT 5100
CGATAGATAT TTTATTTTTA TGTCTTATCA ATAACATTTC TATATAATGT TGCTTCTTTA 5160
AGGAAAAGTG TTGTATGTCA ATACTTTATG AGAAACTGAT TTATATATGC AAATGATTGA 5220
ATCCAAACTG TTTTGTGGAT TAAACTCTAT GCAACATTAT ATATTTACAT GATCTAAAGG 5280
TTTTGTAATT CAAAAGCTGT CATAGTTAGA AGATAACTAA ACATTGTAGT AACCAAGTTT 5340
AATTTACTTT TTTGAGTTTA CATAACTAAC CAAGCCAAAA GGTTATAAAA TCTAAATTCG 5400
TTGAGTTGTC AAACTTCTGA AGATTGCTAT CCTCTTTGAG TTGCTTTCTT TTGGGTGCTT 5460
GAGTTTCATT AGGCTGAGCT GACTCGTTGC TCTCTAGTCT TTCATCTCTG TCTTTTCCAA 5520
GGATTCATAA CGTTGGTCGC TCTCTGTTTC TGCCTACACT TCTTCAAGGG ATCATTACTG 5580
AGGCTAAGAG TTAAAGACCT GAACCATGGT TTTCTGTAAC TGGTTCAAGT TCATTCTCCG 5640
GTTATTGTGT GGTTATCTTT CGGTTAGATT GAAACCCATA TGTTTGCTCT GTTTCTTCTA 5700
GTTCCAAGTT TAATTTCCGG TTATTGTTTG GCTTTTTAAA AGTTTTTAAG GTCTATTCTA 5760
TGTAAAGACT ATTCTACGTA CGTACATTTA TCGCAAAATT GAAAGATTAT AAAAAAAATT 5820
GAAAGATCCA AAGGAAACCA ATAGATTAAA CTAAAATGTA GTATCCTTTT TATCATTTTA 5880
GGCTATGTTT TCTTTTAAGA AAGCTTTGGT AGTTAACTCT GTTTAAAAGA AAAAATU^GAG 5940
ATGCATAAAT TAAATTTAAG TTTCTAGAAC TTTTGGATAA ACATATTAAG CTAAAGAT^T 6000
TAAACTAAAG GGCGTAAATG CAAGCTTGTT ATGCGTTATT GAAAACATTA CCTCTAAATT 6060
AAATAGCCCA ATATTGAAAA CCTTAAGCTT CTTTGATCCC CTTAACTTGT TTGTCCACCA 6120
AGTATTAGTT CATCTCTTAA CACGGCAACT CGAAACGGCA CAATGGACAA ACATGGTCTT 6180
TCAAATU^CCA CTTCCCAATA CATCCATCGT CAAACTCGTG GCCACATGGT AAGGTCACCA 6240
CTATTTCTCC CTTTTCAAAC TCCTCCAAAC AAATTGTGCA CACACTGGCG TCAGAGTTGG 6300
ATTTCTTCTT ATTATTATAT ACTTTCCTTG CCAAACGGTC AACCACAAAC TTATTTGCCG 6360
GTCTAATTAA CTCGATATTA TTGGTGGTCT CATCAAACGA GTCAATCCGA GGAGGAGGTG 6420
GAACAATGAC TTTACAGTAC ATGTAAACTA ACGTAGCACA AACTGAAGAG TCTACCATAG 6480

AAATCGACTT ACAGATTCGT TCAGTGAGTT GAGAGTTAGC AATGTCAACA TATTGTTCGG 6540
AGAGCCCTGC TGAGTACAAC CATTCATTCA GTTTTTTCGA GTCATTAGGG TAGGAGGATA 6600
TGACACCTTC GTAGTCATTG TACGAGAGAA CGAAATTTGG TGGAAGACTA ATTGATGTGT 6660
CCGATCTTCG GGCACTTACG CAGATTTTGA ATGATCCAGC ATCTTGTGAT TTCGGTTTGA 6720
GGTCTATTTC GCCGCCAAAG GATATTTCCG CTTCCATAGC TATCPJKAGAG AAAGAAAAAT 6780
AGTGAATCCA AGGTTTAGGG TTTCTTTTCT TTGTCTTNCT TATATATAGA GGCGCTAGAT 6840
TGTATTAAGG ATTATACATA TATATAAGTA ATTGCAATTT GIGAGTTTAT CCTTATTCAT 6900
TTTTAATTTT ATTTACCTTT ATTTAGTTGA TATTGTGTCC TTTTCCTAGG TAGCATTTCC 6960
TTCCATCTGT GTTAATTATT AGCATTTCCT TTCCTTTGTC TTATTTGCCT TTATTTCGTA 7020
GGAAGAAATC CTTTATGNAC CCCATCTTGG CTGAGAACTT GAGATGATTT TAAATCCTCA 7080
AAAATTATTC AATTTATGAT TTCGAAATTG ATATACACTT TATATTTTCT CCTAAAAAAC 7140
CATATTGTAC TAAGAAAAGT AGAAAACCAG ACTTTTTAAT ATGTTAGATT TTAATTGGGT 7200
TCTTAAAGTG TTTTAGCGTT TNACACCGGT TATTCTCCAA AATCCAAACT CTAT7VATTAT 7260
AGTTTTTAAG TATAAATTAA TCCGGTTGGC CCAATTAGTG GACCGTTTAA AGAGTAGACA 7320
^,prp,j,rj,^nprprp,j, TATATATCGA CTACCATAAA ACTTTAACGA TTAATATTTT TGGATAATAA 7380
GCGATCGTTT TGAGGCGTCC CAATTTTTTT TGTTTCTTTT TATATGAGAA ATGGGTTTAA 7440
GAAAAACTGC AATTTTGTCC ATAAAGCTAG TCAGAATTCC TGCAGCCCGG GGGATCCACT 7500
AGTTCTAGAG CGGCCGCCAC CGCGGTGGAG CTCCAATTCG CCCTATAGTG AGTCGTATTA 7560

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

Met Pro Ile Ala Asn Val Ile
1 5

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 12 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

Ile Gln Glu Cys Val Ser Glu Tyr Ile Ser Phe Val
1 5 10

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

GGAATTCAGC AACAACCCAA CCCCA 25



(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 27 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

GCTCTAGACA TACAACACTT TTCCTTA 27

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

ATGACCAGCT CAGTCATAGT AGC 23

(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

GCCACACATG GTGGTTGCTG CTG 23



(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

GAGATAGAGA CCGATCGTGG TTC

(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

TCACTTATAC TGACCATAAT GGTC

(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

GCATAGATGC ACTCGAAATC AGCC

(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

GCTTGGTAAT AATTGTCATT AG

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2 0 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:
CTAAAAACAT CTACGGTTCA                20

(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:
TTTGTGGTTG ACCGTTTGGC 

(2) INFORMATION FOR SEQ ID NO: 17: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 7 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

Leu Pro Ile Ala Asn Val Ala
1               5



(2) INFORMATION FOR SEQ ID NO: 18: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

Met Gln Glu Cys Val Ser Glu Phe Ile Ser Phe Val
1               5                   10

1. An isolated nucleic acid molecule comprising a LEC1 polynucleotide
sequence, which polynucleotide sequence specifically hybridizes to SEQ. ID. No. 1 under
stringent conditions.

2. The isolated nucleic acid molecule of claim 1, wherein the LEC1
polynucleotide is between about 100 nucleotides and about 630 nucleotides in length.

3. The isolated nucleic acid molecule of claim 1, wherein the LEC1
polynucleotide is SEQ. ID. No. 1.

4. The isolated nucleic acid molecule of claim 1, wherein the LEC1
polynucleotide encodes a LEC1 polypeptide of between about 50 and about 210 amino acids.

5. The isolated nucleic acid molecule of claim 4, wherein the LEC1
polypeptide has an amino acid sequence as shown in SEQ. ID. No. 2.

6. The isolated nucleic acid molecule of claim 1, further comprising a
plant promoter operably linked to the LEC1 polynucleotide.

7. The isolated nucleic acid molecule of claim 6, wherein the plant
promoter is from a LEC1 gene.

8. The isolated nucleic acid of claim 7, wherein the LEC1 gene is as
shown in SEQ. ID. No. 3.

9. The isolated nucleic acid of claim 7, wherein the LEC1 gene is as
shown in SEQ. ID. No. 4.

10. The isolated nucleic acid of claim 7, wherein the LEC1 polynucleotide
is linked to the promoter in an antisense orientation.

11. An isolated nucleic acid molecule comprising a LEC1 polynucleotide
sequence, which polynucleotide sequence encodes a LEC1 polypeptide of between about 50
and about 210 amino acids.

12. The isolated nucleic acid of claim 10, wherein the LEC1 polypeptide
has an amino acid sequence as shown in SEQ. ID. No. 2.

13. A transgenic plant comprising an expression cassette containing a plant
promoter operably linked to a heterologous LEC1 polynucleotide that specifically hybridizes
to SEQ. ID. No. 1 under stringent conditions.

14. The transgenic plant of claim 12, wherein the heterologous LEC1
polynucleotide encodes a LEC1 polypeptide.

15. The transgenic plant of claim 13, wherein the LEC1 polypeptide is
SEQ. ID. No. 2.

16. The transgenic plant of claim 12, wherein the heterologous LEC1
polynucleotide is linked to the promoter in an antisense orientation.

17. The transgenic plant of claim 12, wherein the plant promoter is from a
LEC1 gene.

18. The transgenic plant of claim 16, wherein the LEC1 gene is as shown
in SEQ. ID. No. 3.

19. The transgenic plant of claim 12, which is a member of the genus
Brassica.

20. A method of modulating seed development in a plant, the method
comprising introducing into the plant an expression cassette containing a plant promoter
operably linked to a heterologous LEC1 polynucleotide that specifically hybridizes to SEQ.
ID. No. 1 under stringent conditions.

21. The method of claim 19, wherein the heterologous LEC1
polynucleotide encodes a LEC1 polypeptide.

22. The method of claim 20, wherein the LEC1 polypeptide has an amino
acid sequence as shown in SEQ. ID. No. 2.

23. The method of claim 19, wherein the heterologous LEC1
polynucleotide is linked to the promoter in an antisense orientation.

24. The method of claim 19, wherein the heterologous LEC1
polynucleotide is SEQ. ID. No. 1.

25. The method of claim 19, wherein the plant promoter is from a LEC1
gene.

26. The method of claim 19, wherein the LEC1 gene is as shown in SEQ.
ID. No. 3.

27. The method of claim 19, wherein the plant is a member of the genus
Brassica.

28. The method of claim 19, wherein the expression cassette is introduced 
into the plant through a sexual cross. 

29. An isolated nucleic acid molecule comprising a plant promoter that 
specifically hybridizes to a polynucleotide sequence consisting of nucleotides 1 to 1998 of 
SEQ. ID. No. 3.

30. The isolated nucleic acid molecule of claim 28, wherein the plant
promoter sequence consists essentially of nucleotides 1 to 1998 of SEQ. ID. No. 3.

31. The isolated nucleic acid molecule of claim 28, wherein the plant
promoter sequence is a subsequence of SEQ. ID. No. 4.

32. The isolated nucleic acid molecule of claim 28, further comprising a 
polynucleotide sequence operably linked to the plant promoter sequence. 

33. The isolated nucleic acid of claim 30, wherein the polynucleotide
sequence operably linked to the plant promoter sequence encodes a desired polypeptide.

34. The isolated nucleic acid molecule of claim 28, wherein the 
polynucleotide sequence is linked to the promoter in an antisense orientation. 

35. A transgenic plant comprising an expression cassette containing a
LEC1 promoter operably linked to a heterologous polynucleotide sequence, wherein the
LEC1 promoter specifically hybridizes to SEQ. ID. No. 3 under stringent conditions.

36. The transgenic plant of claim 33, wherein the polynucleotide sequence
encodes a desired polypeptide.



37. The transgenic plant of claim 33, wherein the heterologous
polynucleotide sequence is linked to the LEC1 promoter in an antisense orientation.

38. The transgenic plant of claim 33, wherein the LEC1 promoter is as
shown in SEQ. ID. No. 3.

39. The transgenic plant of claim 33, which is a member of the genus
Brassica.

40. A method of targeting expression of a polynucleotide to a seed, the
method comprising introducing into a plant an expression cassette containing a LEC1
promoter operably linked to a heterologous polynucleotide sequence, wherein the LEC1
promoter specifically hybridizes to a polynucleotide sequence consisting of nucleotides 1 to
1998 of SEQ. ID. No. 3.

41. The method of claim 38, wherein the heterologous polynucleotide
sequence encodes a desired polypeptide.

42. The method of claim 38, wherein the heterologous polynucleotide
sequence is linked to the promoter in an antisense orientation.